
Launching a Democratization of Data Science


It’s a sad but true fact that most data that’s generated or collected—even with considerable effort—never gets any kind of serious analysis. But in a sense that’s not surprising. Because doing data science has always been hard. And even expert data scientists usually have to spend lots of time wrangling code and data to do any particular analysis.

I myself have been using computers to work with data for more than a third of a century. And over that time my tools and methods have gradually evolved. But this week—with the release of Wolfram|Alpha Pro—something dramatic has happened that will forever change the way I approach data.

The key idea is automation. The concept in Wolfram|Alpha Pro is that I should just be able to take my data in whatever raw form it arrives, and throw it into Wolfram|Alpha Pro. Wolfram|Alpha Pro should then automatically do a whole bunch of analysis, and give me a well-organized report about my data. And if my data isn’t too large, this should all happen in a few seconds.

And what’s amazing to me is that it actually works. I’ve got all kinds of data lying around: measurements, business reports, personal analytics, whatever. And I’ve been feeding it into Wolfram|Alpha Pro. And Wolfram|Alpha Pro has been showing me visualizations and coming up with analyses that tell me all kinds of useful things about the data.

Data input

In the past, when I’d really been motivated, I’d take some data here or there, read it into Mathematica, and use some of the powerful tools there to do some analysis or another. But what’s new and exciting with Wolfram|Alpha Pro is that it is all so automatic. On a whim I can throw my data in, and expect to see something useful come out.

The basic idea is very much in line with the whole core mission of Wolfram|Alpha: to take expert-level knowledge, and create a system that can apply it automatically whenever and wherever it’s needed. Here the expert-level knowledge is the collection of methods that a team of good data scientists would have, and what Wolfram|Alpha Pro does is to take that knowledge and use it to analyze whatever data you feed in.

There are many challenges, and we’re still at an early stage in addressing all of them. But with the whole Wolfram|Alpha technology stack, as well as with the underlying Mathematica language, we were able to start from a very strong foundation. And in the course of building Wolfram|Alpha Pro we’ve invented all kinds of new methods.

Categories-number-gender

There are several pieces to the whole problem. The first is just to get the data into Wolfram|Alpha in any kind of well-structured form. And as anyone who’s actually worked with real data knows, that’s often not as easy as it sounds.

You think you’ve got data that’s arranged in columns. But what about those weird separators? What about those headers? What about those delimiters that occur inside data elements? What about those missing elements? What about those lines that were stripped when copying from a browser? What about that second table in the same spreadsheet? And so on.

It’s a little like what Wolfram|Alpha has to do in understanding free-form natural language, with all its variations and redundancies. But the grammar for structured data is different, and in some ways less forgiving. And just as in the original development of Wolfram|Alpha, what we’ve done is to take a large corpus of examples, and try to deduce the appropriate grammar from what we see—with the knowledge that as we get large volumes of actual queries, we’ll gradually be able to improve this. (Needless to say, we use the analysis capabilities of Wolfram|Alpha Pro itself to do much of this analysis.)
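To get a feel for the kind of wrangling involved, here’s a minimal Mathematica sketch of the sort of cleanup one would otherwise do by hand. The file name and the cleanup heuristics are hypothetical examples, not Wolfram|Alpha Pro’s actual internals:

    (* import a messy tab-separated file; "data.txt" is a placeholder *)
    raw = Import["data.txt", "TSV"];
    (* guess the true column count as the commonest row length,
       and drop header or junk rows that don't match it *)
    n = First[Commonest[Length /@ raw]];
    rows = Select[raw, Length[#] == n &];
    (* mark empty strings as missing elements *)
    clean = rows /. "" -> Missing["Empty"]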

OK, so we’ve figured out where the individual elements in our data are. Now we have to figure out what they are. And here’s where Wolfram|Alpha’s linguistic prowess is crucial. Because it immediately allows us to understand all those weird formats for numbers and dates and so on. And more than that, it lets us recognize units and place names and lots of other things, and automatically put them into a standard computable form.

Sometimes in ordinary Wolfram|Alpha, when there’s a date or unit or place that’s given in the input, it can be ambiguous. But when it’s fed whole columns of data, Wolfram|Alpha Pro can usually automatically resolve these ambiguities (“All dates are probably US style”; “those units are probably all temperature units”; etc.).
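Here’s a toy version of that column-level disambiguation, just as a sketch; the sample column and the helper function are mine, not Wolfram|Alpha Pro’s internals. The idea is to test each date order against the whole column, and keep whichever one every entry is consistent with:

    (* which field stays in the range 1..12 across the whole column? *)
    col = {"3/4/2011", "5/14/2011", "9/30/2011"};
    fields[s_] := ToExpression[StringSplit[s, "/"]];
    usOK = And @@ (1 <= fields[#][[1]] <= 12 & /@ col);  (* month first *)
    euOK = And @@ (1 <= fields[#][[2]] <= 12 & /@ col);  (* month second *)
    Which[usOK && !euOK, "US-style dates",
          euOK && !usOK, "European-style dates",
          True, "still ambiguous"]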

Cities

So let’s say that Wolfram|Alpha Pro knows what all the elements in a table of data are—what their “values” are. Then it has to start figuring out what they “mean”. Does that sequence of numbers represent some kind of labels or coordinates? Or is it just samples from a random distribution? Does that sequence of currency values represent an asset price with random-walk-like variations? Or is it just a sequence of unrelated currency amounts? Are both those columns actually primary data, or is one of them just the rankings for the other? Etc. etc.

Wolfram|Alpha Pro has a large number of algorithms and heuristics for trying to deduce what the data it’s given represents. And this immediately puts it on track to see what kind of visualizations and analyses it should do.
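As one toy example of such a heuristic (purely illustrative; the real tests are more elaborate): a numeric column whose sorted values are exactly 1 through n is much more likely to be a ranking than primary data:

    (* is a column of numbers probably a ranking rather than raw data? *)
    rankingQ[col_] := Sort[col] === Range[Length[col]];
    rankingQ[{3, 1, 4, 2}]      (* True: a permutation of 1..4 *)
    rankingQ[{3.2, 1.5, 7.9}]   (* False: just numbers *)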

There are always tricky issues. When does it make sense to join points in a 2D plot? When should one use bar charts versus scatter plots versus pie charts, etc.? What plots have scales that are close enough to combine? How should one set up regression analysis: what variables should one try to predict? And so on.

Wolfram|Alpha Pro inherits from Mathematica many standard kinds of statistical analysis. But what it does is to completely automate these. Sometimes it chooses what kind of analysis makes sense based on looking at the data. But often it will just run a fair number of possible analyses in parallel, then report only the ones that make sense.

At some level, a key objective of Wolfram|Alpha Pro is to be able to take any set of data and “tell a story” from it: to show what’s interesting or unusual about the data, and what conclusions can be drawn from it.

Dates-currency-2

One example is fits. Given data, Wolfram|Alpha Pro will typically try a large number of different kinds of functional forms. Straight lines. Polynomials. Exponentials. Logistic curves. Sine curves. And so on. And then it has criteria for deciding which, if any, of these represent a reasonable fit to the original data.
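Here’s a minimal sketch of that idea in Mathematica, with made-up data and a deliberately tiny candidate list (the real system tries far more forms, and uses better criteria than raw residuals):

    (* noisy exponential test data *)
    data = Table[{x, 2.5 E^(0.3 x) + RandomReal[{-0.5, 0.5}]}, {x, 0., 10., 0.5}];
    (* candidate functional forms, each with its parameter list *)
    candidates = {{a + b x, {a, b}}, {a E^(b x), {a, b}}, {a + b x + c x^2, {a, b, c}}};
    (* total squared residual of the best fit for a given form *)
    residual[{form_, params_}] := Module[{fit = FindFit[data, form, params, x]},
      Total[(data[[All, 2]] - (form /. fit /. x -> data[[All, 1]]))^2]];
    First[SortBy[candidates, residual]]   (* the best-fitting form wins *)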

Wolfram|Alpha Pro does the same kind of thing for probability distributions. It also uses all kinds of statistical methods to be able to make statistical conclusions, exclude statistical hypotheses or not, and so on.

Things get even more interesting when the data it’s dealing with doesn’t just consist of numbers.

If it’s given, say, dates and currency values, it can figure out things like currency conversions, and inflation adjustments. If it’s given places, it can plot them on a map, but it can also normalize by properties of a place (like population or area). And if it’s given arbitrary objects with the right level of repetition, it’ll treat them as nodes in a network.

Email-addresses

For any given data that’s been input, Wolfram|Alpha Pro usually has a very large number of analyses it can run. But the challenge then is to prune, combine and organize the results to emphasize what is important, and to make them as easy for a human to assimilate as possible—appropriately adding textual summaries that are rigorous but understandable to non-experts.

Usually what will happen is that Wolfram|Alpha Pro will give an overall summary as its “default report”, and then have all sorts of buttons and pulldowns that allow drill-down to many variations or details.

In my many years of working with data, I’ve probably at some time or another generated at least a few of most of the kinds of plots, tables and analyses that Wolfram|Alpha Pro shows. But I’m quite certain that in any particular case, I’ve never generated more than a small fraction of what Wolfram|Alpha Pro would produce.

And the important thing is that by automatically generating a whole report with carefully chosen entries, Wolfram|Alpha Pro gives me something where at a glance I can start to understand what’s in my data.

Any particular part of the result, I could no doubt reproduce, with sufficient time spent wrangling code and data. But the whole point is that as a practical matter, I would only end up doing it if I pretty much knew what I was looking for. It just takes too much time to do it “on a whim”, for purely exploratory purposes.

But Wolfram|Alpha Pro changes all of this. Because for the first time, it makes it immediate to get a whole report on any data I have. And what this means is that in practice I’ll actually end up doing this. As is so often the case, a sufficiently large “quantitative” change in how easy it is to do something leads to a qualitative change in what we’ll in practice do.

Now, needless to say, the version of Wolfram|Alpha Pro that arrived this week is just the beginning. There are plenty of additional analyses to include, and plenty of new types of data with special characteristics to handle.

States-genders-counts-currencies

And right now, Wolfram|Alpha Pro is set up just to handle fairly small datasets (thousands of rows, handfuls of columns), where it can generate a meaningful report in a typical “web response time” of a few seconds.

There’s nothing about the architecture or the underlying Mathematica infrastructure, though, that restricts datasets to this size. And I expect that in the future we’ll be able to handle bigger and bigger datasets using the Wolfram|Alpha Pro technology stack.

But for now I’m just pleased at how easy it’s become to take almost any reasonably small lump of raw data, and use Wolfram|Alpha Pro to start getting meaningful insights from it. It is, I believe, a major democratization of the achievements of data science. And a way that much more of the data that’s generated in the world can be used in meaningful ways.


The Personal Analytics of My Life


One day I’m sure everyone will routinely collect all sorts of data about themselves. But because I’ve been interested in data for a very long time, I started doing this long ago. I actually assumed lots of other people were doing it too, but apparently they were not. And so now I have what is probably one of the world’s largest collections of personal data.

Every day—in an effort at “self awareness”—I have automated systems send me a few emails about the day before. But even though I’ve been accumulating data for years—and always meant to analyze it—I’ve never actually gotten around to doing it. But with Mathematica and the automated data analysis capabilities we just released in Wolfram|Alpha Pro, I thought now would be a good time to finally try taking a look—and to use myself as an experimental subject for studying what one might call “personal analytics”.

Let’s start off talking about email. I have a complete archive of all my email going back to 1989—a year after Mathematica was released, and two years after I founded Wolfram Research. Here’s a plot with a dot showing the time of each of the third of a million emails I’ve sent since 1989:

Plot with a dot showing the time of each of the third of a million pieces of email

The first thing one sees from this plot is that, yes, I’ve been busy. And for more than 20 years, I’ve been sending emails throughout my waking day, albeit with a little dip around dinner time. The big gap each day comes from when I was asleep. And for the last decade, the plot shows I’ve been pretty consistent, going to sleep around 3am ET, and getting up around 11am (yes, I’m something of a night owl). (The stripe in summer 2009 is a trip to Europe.)

But what about the 1990s? Well, that was when I spent a decade as something of a hermit, working very hard on A New Kind of Science. And the plot makes it very clear why, in the late 1990s, when one of my children was asked for an example of “being nocturnal”, they gave me as their answer. The rather dramatic discontinuity in 2002 is the moment when A New Kind of Science was finally finished, and I could start leading a different kind of life.

So what about other features of the plot? Some line up with identifiable events and trends in my life, sometimes reflected in my online scrapbook or timeline. Others at first I don’t understand at all—until a quick search of my email archive jogs my memory. It’s very convenient that I can always drill down and read a raw email. Because as with essentially any long-timescale data project, there are all kinds of glitches (here like misformatted email headers, unset computer clocks, and untagged automated mailings) that have to be found and systematically corrected for before one has consistent data to analyze. And before, in this case, I can trust that any dots in the middle of the night are actually times I woke up and sent email (which is nowadays very rare).
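For anyone who wants to try this on their own archive: once the timestamps are extracted, the basic diurnal scatter plot takes only a few lines of Mathematica. This is a sketch, with three made-up timestamps standing in for a real mail archive:

    (* each timestamp becomes a point: date on x, fractional hour of day on y *)
    times = {{2009, 5, 18, 14, 32, 0}, {2009, 5, 18, 23, 5, 0}, {2009, 5, 19, 2, 41, 0}};
    points = {Take[#, 3], #[[4]] + #[[5]]/60.} & /@ times;
    DateListPlot[points, Joined -> False, FrameLabel -> {"date", "hour of day"}]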

The plot above suggests that there’s been a progressive increase in my email volume over the years. One can see that more explicitly if one just plots the total number of emails I’ve sent as a function of time:

Daily outgoing emails and monthly outgoing emails

Again, there are some life trends visible. The gradual decrease in the early 1990s reflects me reducing my involvement in day-to-day management of our company to concentrate on basic science. The increase in the 2000s is me jumping back in, and driving more and more company projects. And the peak in early 2009 reflects the final preparations for the launch of Wolfram|Alpha. (The individual spikes, including the all-time winner August 27, 2006, are mostly weekend or travel days specifically spent “grinding down” email backlogs.)

The plots above seem to support the idea that “life’s complicated”. But if one aggregates the data a bit, it’s easy to end up with plots that seem like they could just be the result of some simple physics experiment. Like here’s the distribution of the number of emails I’ve sent per day since 1989:

Distribution of emails per day

What is this distribution? Is there a simple model for it? I don’t know. Wolfram|Alpha Pro tells us that the best fit it finds is to a geometric distribution. But it officially rejects that fit. Still, at least the tail seems—as so often—to follow a power law. And perhaps that’s telling me something about myself, though I have to say I don’t know what.
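The automated version of this kind of fitting is easy to reproduce by hand in Mathematica; here’s a sketch, with simulated counts standing in for the real daily totals:

    (* fit a geometric distribution to daily counts and test the fit *)
    counts = RandomVariate[GeometricDistribution[0.03], 1000];   (* stand-in data *)
    dist = EstimatedDistribution[counts, GeometricDistribution[p]];
    DistributionFitTest[counts, dist]   (* small p-value: reject the fit *)
    (* a roughly straight tail on a log-log plot suggests a power law *)
    ListLogLogPlot[Tally[Select[counts, # > 0 &]]]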

Here’s another measure of email activity: the number of distinct recipients (people and mailgroups) I’ve sent email to each month:

Monthly distinct email recipients

The vast majority of these recipients are people or mailgroups within our company. And I suspect the overall growth is a reflection of both the increasing number of people at the company, and the increasing number of projects in which I and our company are involved. The peaks are often associated with intense early-stage projects, where I am directly interacting with lots of people, and there isn’t yet a well-organized management structure in place. I don’t quite understand the recent decrease, considering that the number of projects is at an all-time high. I’m just hoping it reflects better organization and management…

OK, so all of that is about email I’ve sent. What about email I’ve received? Here’s a plot comparing my incoming and outgoing email:

Average daily emails

The peaks in 1996 and 2009 are both associated with the later phases of big projects (Mathematica 3 and the launch of Wolfram|Alpha) where I was watching all sorts of details, often using email-based automated systems.

OK. So email is one kind of data I’ve systematically archived. And there’s a huge amount that can be learned from that. Another kind of data that I’ve been collecting is keystrokes. For many years, I’ve captured every keystroke I’ve typed—now more than 100 million of them:

Diurnal plot of keystrokes

Daily keystrokes, averaged by month

There are all kinds of detailed facts to extract: like that the average fraction of keys I type that are backspaces has consistently been about 7% (I had no idea it was so high!). Or how my habits in using different computers and applications have changed. And looking at the daily totals, I can see spikes of writing activity—typically associated with creating longer documents (including blog posts). But at least at an overall level things like the plots above look similar for keystrokes and email.
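Statistics like that backspace fraction are one-liners once the log is in a list; the keystroke log format here is purely hypothetical:

    (* fraction of typed keys that are backspaces; \.08 is ASCII backspace *)
    keys = {"t", "h", "e", "\.08", "e", "n"};   (* stand-in keystroke log *)
    N[Count[keys, "\.08"]/Length[keys]]   (* 1/6 here; about 0.07 in my real log *)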

What about other measures of activity? My automated systems have been quietly archiving lots of them for years. And for example this shows the times of events that have appeared in my calendar:

Diurnal plot of calendar events

The changes over the years reflect quite directly things going on in my life. Before 2002 I was doing a lot of solitary work, particularly on A New Kind of Science, and having only a few scheduled meetings. But then as I initiated more and more new projects at our company, and took a more and more structured approach to managing them, one can see more and more meetings getting filled in. Though my “family dinner stripe” remains clearly visible.

Here’s a plot of the daily average total number of meetings (and other calendar events) that I’ve done over the years:

Average events per day

The trend is pretty clear. And it reflects the fact that in the past decade or so I’ve gradually learned to work better “in public”, efficiently figuring things out while interacting with groups of people—which I’ve discovered makes me much more effective both at using other people’s expertise and at delegating things that have to be done.

It often surprises people when I tell them this, but since 1991 I’ve been a remote CEO, interacting with my company almost exclusively just by email and phone (usually with screensharing). (No, I don’t find videoconferencing with the company very useful, and the telepresence robot I got recently has mostly been standing idle.)

So phone calls are another source of data for me. And here’s a plot of the times of calls I’ve made (the gray regions are missing data):

Diurnal plot of phone calls

Yes, I spend many hours on the phone each day:

Daily hours on the phone and monthly hours on the phone

And this shows how the probability of finding me on the phone varies during the day:

On-phone probability

This is averaged over all days for the last several years, and in fact I’m guessing that the “peak weekday probability” would actually be even higher than 70% if the average excluded days when I’m away for one reason or another.

Here’s another way to look at the data—this shows the probability for calls to start at a given time:

Call start times

There’s a curious pattern of peaks—near hours and half-hours. And of course those occur because many phone calls are scheduled at those times. Which means that if one plots meeting start times and phone call start times one sees a strong correlation:

Calls and meetings

I was curious just how strong this correlation is: in effect just how scheduled all those calls are. And looking at the data I found that at least half of my external phone meetings do indeed start within 2 minutes of their appointed times. For internal meetings—which tend to involve more people, and which I normally have scheduled back-to-back—there’s a somewhat broader distribution, shown below:

Differences between meeting and phone call start times

When one looks at the distribution of call durations one sees a kind of “physics-like” background shape, but on top of that there’s the “obviously human” peak at the 1-hour mark, associated with meetings that are scheduled to be an hour long:

Call durations
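That two-part structure is easy to see in a histogram; here’s a sketch with simulated durations standing in for the real phone logs:

    (* smooth "background" of call lengths, plus a bump of hour-long meetings *)
    background = RandomVariate[ExponentialDistribution[1/20.], 400];   (* minutes *)
    hourLong = 60. + RandomVariate[NormalDistribution[0, 2], 100];
    Histogram[Join[background, hourLong], {0, 120, 2}, AxesLabel -> {"minutes", "calls"}]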

So far everything we’ve talked about has measured intellectual activity. But I’ve also got data on physical activity. Like for the past couple of years I’ve been wearing a little digital pedometer that measures every step I take:

Diurnal plot of steps taken

Daily steps averaged by month

And once again, this shows quite a bit of consistency. I take about the same number of steps every day. And many of them are taken in a block early in my day (typically coinciding with the first couple of meetings I do). There’s no mystery to this: years ago I decided I should take some exercise each day, so I set up a computer and phone to use while walking on a treadmill. (Yes, with the correct ergonomic arrangement one can type and use a mouse just fine while walking on a treadmill, at least up to—for me—a speed of about 2.5 mph.)

OK, so let’s put all this together. Here are my “average daily rhythms” for the past decade (or in some cases, slightly less):

Graphs of incoming emails, outgoing emails, keystrokes, meetings and events, calls, and steps as a function of time

The overall pattern is fairly clear. It’s meetings and collaborative work during the day, a dinner-time break, more meetings and collaborative work, and then in the later evening more work on my own. I have to say that looking at all this data I am struck by how shockingly regular many aspects of it are. But in general I am happy to see it. For my consistent experience has been that the more routine I can make the basic practical aspects of my life, the more I am able to be energetic—and spontaneous—about intellectual and other things.

And for me one of the objectives is to have ideas, and hopefully good ones. So can personal analytics help me measure the rate at which that happens?

It might seem very difficult. But as a simple approximation, one can imagine seeing at what rate one starts using new concepts, by looking at when one starts using new words or other linguistic constructs. Inevitably there are tricky issues in identifying genuine new “words” etc. (though for example I have managed to determine that when it comes to ordinary English words, I’ve typed about 33,000 distinct ones in the past decade). If one restricts to a particular domain, things become a bit easier, and here for example is a plot showing when names of what are now Mathematica functions first appeared in my outgoing email:

First email appearance of Mathematica functions

The spike at the beginning is an artifact, reflecting pre-existing functions showing up in my archived email. And the drop at the end reflects the fact that one doesn’t yet know future Mathematica names. But it’s interesting to see elsewhere in the plot little “bursts of creativity”, mostly but not always correlated with important moments in Mathematica history—as well as a general increase in density in recent times.
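The underlying computation is straightforward: for every word (or function name) in the outgoing-message archive, find the earliest date on which it occurs. Here’s a sketch, with a two-message “archive” standing in for the real one:

    (* archive entries are {date, text} pairs; find each word's first use *)
    archive = {{{1992, 6, 1}, "trying Integrate again today"},
               {{1996, 9, 3}, "NIntegrate and Integrate both look fine"}};
    words = Flatten[Function[{d, s}, {d, #} & /@ StringSplit[s]] @@@ archive, 1];
    firstUse = First /@ GatherBy[SortBy[words, AbsoluteTime[First[#]] &], Last]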

As a quite different measure of creative progress, here’s a plot of when I modified the text of chapters in A New Kind of Science:

Plot of when chapters were modified in A New Kind of Science

I don’t have data readily at hand from the beginning of the project. And in 1995 and 1996 I continued to do research, but stopped editing text, because I was pulled away to finish Mathematica 3 (and the book about it). But otherwise one sees inexorable progress, as I systematically worked out each chapter and each area of the science. One can see the time it took to write each chapter (Chapter 12 on the Principle of Computational Equivalence took longest, at almost 2 years), and which chapters led to changes in which others. And with enough effort, one could drill down to find out when each discovery was made (it’s easier with modern Mathematica automatic history recording). But in the end—over the course of a decade—from all those individual keystrokes and file modifications there gradually emerged the finished A New Kind of Science.

It’s amazing how much it’s possible to figure out by analyzing the various kinds of data I’ve kept. And in fact, there are many additional kinds of data I haven’t even touched on in this post. I’ve also got years of curated medical test data (as well as my not-yet-very-useful complete genome), GPS location tracks, room-by-room motion sensor data, endless corporate records—and much much more.

And as I think about it all, I suppose my greatest regret is that I did not start collecting more data earlier. I have some backups of my computer filesystems going back to 1980. And if I look at the 1.7 million files in my current filesystem, there’s a kind of archeology one can do, looking at files that haven’t been modified for a long time (the earliest is dated June 29, 1980).

Here’s a plot of the latest modification times of all my current files:

Modification dates of all current files

The colors represent different file types. In the early years, there’s a mixture of plain text files (blue dots) and C language files (green). But gradually there’s a transition to Mathematica files (red)—with a burst of page layout files (orange) from when I was finishing A New Kind of Science. And once again the whole plot is a kind of engram—now of more than 30 years of my computing activities.
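This kind of filesystem archeology is easy to sketch in Mathematica (the directory name is a placeholder, and scanning a large tree takes a while):

    (* gather every file under a directory by type, keeping modification dates *)
    files = FileNames["*", "~/archive", Infinity];
    info = {FileExtension[#], FileDate[#, "Modification"]} & /@ files;
    byType = GatherBy[info, First]   (* one group of dated entries per file type *)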

So what about things that were never on a computer? It so happens that years ago I also started keeping paper documents, pretty much on the theory that it was easier just to keep everything than to worry about what specifically was worth keeping. And now I’ve got about 230,000 pages of my paper documents scanned, and when possible OCR’ed. And as just one example of the kind of analysis one can do, here’s a plot of the frequency with which different 4-digit “date-like sequences” occur in all these documents:

Occurrence of years in scanned documents

Of course, not all these 4-digit sequences refer to dates (especially, for example, “2000”)—but many of them do. And from the plot one can see the rather sudden turnaround in my use of paper in 1984—when I turned the corner to digital storage.
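Extracting those sequences from the OCR’ed text is a simple pattern match; in this sketch, two short strings stand in for the real scanned pages:

    (* tally 4-digit year-like sequences across OCR'ed page strings *)
    pages = {"invoice dated 1982, paid 1983", "meeting notes, spring 1983"};
    years = Flatten[StringCases[#, RegularExpression["\\b(19|20)\\d{2}\\b"]] & /@ pages];
    Sort[Tally[years]]   (* {{"1982", 1}, {"1983", 2}} *)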

What is the future for personal analytics? There is so much that can be done. Some of it will focus on large-scale trends, some of it on identifying specific events or anomalies, and some of it on extracting “stories” from personal data.

And in time I’m looking forward to being able to ask Wolfram|Alpha all sorts of things about my life and times—and have it immediately generate reports about them. Not only being able to act as an adjunct to my personal memory, but also to be able to do automatic computational history—explaining how and why things happened—and then making projections and predictions.

As personal analytics develops, it’s going to give us a whole new dimension to experiencing our lives. At first it all may seem quite nerdy (and certainly as I glance back at this blog post there’s a risk of that). But it won’t be long before it’s clear how incredibly useful it all is—and everyone will be doing it, and wondering how they could have ever gotten by before. And wishing they had started sooner, and hadn’t “lost” their earlier years.


Comment added April 5:

Thanks for all the great comments and suggestions, both here and in separate messages!

I’d like to respond to a few common questions that have been asked:

How can I do the same kind of analysis you did?
Eventually I hope the answer will be very simple: just upload your data to Wolfram|Alpha Pro, and it’ll all be automatic. But for now, you can do it using Mathematica programs. We just posted a blog post explaining part of the analysis, and linking to the source for the Mathematica programs that you’ll need. To use them, of course, you’ll still have to get your data into some kind of readable form.

What systems did you use to collect all the data?
Different ones at different times, and on different computer systems. For keystroke data, for example, I used several different keyloggers—mostly rather shadowy pieces of software marketed primarily for surreptitious uses. For the phone call data, all my landline phones have always been connected to our company phone system (originally a PBX, now a VoIP system), so I was able to use its built-in logging capabilities. For email, I had a script set up as part of our company email system back in 1989 that forks off a copy of all my messages, and sends them to an archive. This script has had to be updated quite a few times over the years when we’ve changed email systems.

How does your treadmill setup work?
It’s pretty straightforward. I have a keyboard mounted on a board that attaches to the two side rails of the treadmill. I’ve carefully adjusted the height of the keyboard, and I’ve put a gel strip in front of it, to rest my wrists on. I have the mouse on a little platform at the side of the treadmill. And I have two displays mounted in front of me. I’ve sometimes thought about developing some kind of kit to let other people “computerize” their treadmills… but it’s seemed too far from my usual business. (And when I first had the treadmill set up, I was still a bit embarrassed about my impending middle age, and need for exercise.)

With everything you have going on, do you find time for your family?
Happily, very much so. It’s helped a great deal that I’ve always worked at home, so when I’m not actively in the middle of working, I can spend time with my family. It’s also helped that I’ve been very consistent for a long time in taking an extended dinner break with my family (that’s the 2.5 hour gap visible in the early evening in most of my plots). In this post, I concentrated on work-related personal analytics; I have quite a lot more that’s family oriented, but I didn’t include it here.

Overcoming Artificial Stupidity


Today marks an important milestone for Wolfram|Alpha, and for computational knowledge in general: for the first time, Wolfram|Alpha is now on average giving complete, successful responses to more than 90% of the queries entered on its website (and with “nearby” interpretations included, the fraction is closer to 95%).

I consider this an impressive achievement—the hard-won result of many years of progressively filling out the knowledge and linguistic capabilities of the system.

The picture below shows how the fraction of successful queries (in green) has increased relative to unsuccessful ones (red) since Wolfram|Alpha was launched in 2009. And from the log scale in the right-hand panel, we can see that there’s been a roughly exponential decrease in the failure rate, with a half-life of around 18 months. It seems to be a kind of Moore’s law for computational knowledge: the net effect of innumerable individual engineering achievements and new ideas is to give exponential improvement.

Wolfram|Alpha query success rate
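In other words, if the failure fraction really does halve every 18 months or so, a minimal model is just exponential decay. As a sketch (f0, the failure fraction at launch, is a hypothetical parameter):

    (* failure fraction t months after launch, with an 18-month half-life *)
    failure[t_, f0_] := f0 2^(-t/18);
    failure[36, 0.4]   (* two half-lives after launch: 0.1 *)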

But to celebrate reaching our 90% query success rate, I thought it’d be fun to take a look at some of what we’ve left behind. Ever since the early days of Wolfram|Alpha, we’ve been keeping a scrapbook of our favorite examples of “artificial stupidity”: places where Wolfram|Alpha gets the wrong idea, and applies its version of “artificial intelligence” to go off in what seems to us humans as a stupid direction.

Here’s an example, captured over a year ago (and now long-since fixed):

guinea pigs

When we typed “guinea pigs”, we probably meant those furry little animals (which for example I once had as a kid). But Wolfram|Alpha somehow got the wrong idea, and thought we were asking about pigs in the country of Guinea, and diligently (if absurdly, in this case) told us that there were 86,431 of those in a 2008 count.

At some level, this wasn’t such a big bug. After all, at the top of the output Wolfram|Alpha perfectly well told us it was assuming “‘guinea’ is a country”, and offered the alternative of taking the input as a “species specification” instead. And indeed, if one tries the query today, the species is the default, and everything is fine, as below. But having the wrong default interpretation a year ago was a simple but quintessential example of artificial stupidity, in which a subtle imperfection can lead to what seems to us laughably stupid behavior.

Here’s what “guinea pigs” does today—a good and sensible result:

guinea pigs

Below are some other examples from our scrapbook of artificial stupidity, collected over the past 3 years. I’m happy to say that every single one of these now works nicely; many actually give rather impressive results, which you can see by clicking each image below.

polar bear speed

Fall of Troy

all the king's horses

dead horse

u of arizona

highest volcano in Italy

men + women

male pattern baldness cure

what is a plum

There’s a certain humorous absurdity to many of these examples. In fact, looking at them suggests that this kind of artificial stupidity might actually be a good systematic source of things that we humans find humorous.

But where is the artificial stupidity coming from? And how can we overcome it?

There are two main issues that seem to combine to produce most of the artificial stupidity we see in these scrapbook examples. The first is that Wolfram|Alpha tries too hard to please—valiantly giving a result even if it doesn’t really know what it’s talking about. And the second is that Wolfram|Alpha may simply not know enough—so that it misses the point because it’s completely unaware of some possible meaning for a query.

Curiously enough, these two issues come up all the time for humans too—especially, say, when they’re talking on a bad cellphone connection, and can’t quite hear clearly.

For humans, we don’t yet know the internal story of how these things work. But in Wolfram|Alpha it’s very well defined. It’s millions of lines of Mathematica code, but ultimately what Wolfram|Alpha does is to take the fragment of natural language it’s given as input, and try to map it into some precise symbolic form (in the Mathematica language) that represents in a standard way the meaning of the input—and from which Wolfram|Alpha can compute results.
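One can actually watch this pipeline from the Mathematica side: the built-in WolframAlpha function sends a free-form string to Wolfram|Alpha, and can return computable Mathematica expressions. A minimal example (the results naturally depend on the live knowledgebase):

    (* send a query to Wolfram|Alpha and ask for a computable result *)
    WolframAlpha["guinea pigs", "Result"]
    (* "MathematicaResult" asks for the underlying symbolic expression *)
    WolframAlpha["highest volcano in Italy", "MathematicaResult"]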

By now—particularly with data from nearly 3 years of actual usage—Wolfram|Alpha knows an immense amount about the detailed structure and foibles of natural language. And of necessity, it has to go far beyond what’s in any grammar book.

When people type input to Wolfram|Alpha, I think we’re seeing a kind of linguistic representation of undigested thoughts. It’s not a random soup of words (as people might feed a search engine). It has structure—often quite complex—but it has scant respect for the niceties of traditional word order or grammar.

And as far as I am concerned one of the great achievements of Wolfram|Alpha is the creation of a linguistic understanding system that’s robust enough to handle such things, and to successfully convert them into precise, computable symbolic expressions.

One can think of any particular symbolic expression as having a certain “basin of attraction” of linguistic forms that will lead to it. Some of these forms may look perfectly reasonable. Others may look odd—but that doesn’t mean they can’t occur in the “stream of consciousness” of actual Wolfram|Alpha queries made by humans.

And usually it won’t hurt anything to allow even very odd forms, with quite bizarre distortions of common language. Because the worst that will happen is that these forms just won’t ever actually get used as input.

But here’s the problem: what if one of those forms overlaps with something with a quite different meaning? If it’s something that Wolfram|Alpha knows about, Wolfram|Alpha’s linguistic understanding system will recognize the clash, and—if all is working properly—will choose the correct meaning.

But what happens if the overlap is with something Wolfram|Alpha doesn’t know about?

In the last scrapbook example above (from 2 years ago) Wolfram|Alpha was asked “what is a plum”. At the time, it didn’t know about fruits that weren’t explicitly plant types. But it did happen to know about a crater on the moon named “Plum”. The linguistic understanding system certainly noticed the indefinite article “a” in front of “plum”. But knowing nothing with the name “plum” other than a moon crater (and erring—at least on the website—in the direction of giving some response rather than none), it will have concluded that the “a” must be some kind of “linguistic noise”, gone for the moon crater meaning, and done something that looks to us quite stupid.

How can Wolfram|Alpha avoid this? The answer is simple: it just has to know more.

One might have thought that doing better at understanding natural language would be about covering a broader range of more grammar-like forms. And certainly this is part of it. But our experience with Wolfram|Alpha is that it is at least as important to add to the knowledgebase of the system.

A lot of artificial stupidity is about failing to have “common sense” about what an input might mean. Within some narrow domain of knowledge an interpretation might seem quite reasonable. But in a more general “common sense” context, the interpretation is obviously absurd. And the point is that as the domains of Wolfram|Alpha knowledge expand, they gradually fill out all the areas that we humans consider common sense, pushing out absurd “artificially stupid” interpretations.

Sometimes Wolfram|Alpha can in a sense overshoot. Consider the query “clever population”. What does it mean? The linguistic construction seems a bit odd, but I’d probably think it was talking about how many clever people there are somewhere. But here’s what Wolfram|Alpha says:

clever population

And the point is that Wolfram|Alpha knows something I don’t: that there’s a small city in Missouri named “Clever”. Aha! Now the construction “clever population” makes sense. To people in southwestern Missouri, it would probably always have been obvious. But with typical everyday knowledge and common sense, it’s not. And just like Wolfram|Alpha in the scrapbook examples above, most humans will assume that the query is about something completely different.

There’ve been a number of attempts to create natural-language question-answering systems in the history of work on artificial intelligence. And in terms of immediate user impression, the problem with these systems has usually been not so much a failure to create artificial intelligence but rather the presence of painfully obvious artificial stupidity. In ways much more dramatic than the scrapbook examples above, the system will “grab” a meaning it happens to know about, and robotically insist on using this, even though to a human it will seem stupid.

And what we learn from the Wolfram|Alpha experience is that the problem hasn’t been our failure to discover some particular magic human-thinking-like language understanding algorithm. Rather, it’s in a sense broader and more fundamental: the systems just didn’t know, and couldn’t work out, enough about the world. It’s not good enough to know wonderfully about just some particular domain; you have to cover enough domains at enough depth to achieve common sense about the linguistic forms you see.

I always conceived Wolfram|Alpha as a kind of all-encompassing project. And what’s now clear is that to succeed it’s got to be that way. Solving a part of the problem is not enough.

The fact that as of today we’ve reached a 90% success rate in query understanding is a remarkable achievement—that shows we’re definitely on the right track. And indeed, looking at the Wolfram|Alpha query stream, in many domains we’re definitely at least on a par with typical human query-understanding performance. We’re not in the running for the Turing Test, though: Wolfram|Alpha doesn’t currently do conversational exchanges, but more important, Wolfram|Alpha knows and can compute far too much to pass for a human.

And indeed after all these years perhaps it’s time to upgrade the Turing Test, recognizing that computers should actually be able to do much more than humans. And from the point of view of user experience, probably the single most obvious metric is the banishment of artificial stupidity.

When Wolfram|Alpha was first released, it was quite common to run into artificial stupidity even in casual use. And I for one had no idea how long it would take to overcome it. But now, just 3 years later, I am quite pleased at how far we’ve got. It’s certainly still possible to find artificial stupidity in Wolfram|Alpha (and it’s quite fun to try). But it’s definitely more difficult.

With all the knowledge and computation that we’ve put into Wolfram|Alpha, we’re successfully making Wolfram|Alpha not only smarter but also less stupid. And we’re continuing to progress down the exponential curve toward perfect query understanding.

It’s Been 10 Years: What’s Happened with A New Kind of Science?


(This is the first of a series of posts related to next week’s tenth anniversary of A New Kind of Science.)

Stephen Wolfram—A New Kind of Science

On May 14, 2012, it’ll be 10 years since A New Kind of Science (“the NKS book”) was published.

After 20 years of research, and nearly 11 years writing the book, I’d taken most things about as far as I could at that time. And so when the book was finished, I mainly launched myself back into technology development. And inspired by my work on the NKS book, I’m happy to say that I’ve had a very fruitful decade (Mathematica reinvented, CDF, Wolfram|Alpha, etc.).

I’ve been doing little bits of NKS-oriented science here and there (notably at our annual Summer School). But mostly I’ve been busy with other things. And so it’s been other people who’ve been having the fun of moving the science of NKS forward. But almost every day I’ll hear about something that’s being done with NKS. And as we approach the 10-year mark, I’ve been very curious to try to get at least a slightly more systematic view of what’s been going on.

A place to start is the academic literature, where there’s now an average of slightly over one new paper per day published citing the NKS book—with that number steadily increasing. The papers span all kinds of areas (here identified by journal fields):

Papers published citing the NKS book identified by journal fields

And looking through the list of papers my main response is “Wow—so much stuff”. Some of the early papers seem a bit questionable, but as the decade has gone on, my impression from skimming through papers is that people are really “getting it”—understanding the ideas in the NKS book, and making good and interesting use of them.

Broadly speaking, there are three categories of NKS work: pure NKS, applied NKS, and the NKS way of thinking.

Pure NKS is about studying the computational universe as basic science for its own sake—investigating simple programs like cellular automata, seeing what they do, and gradually abstracting general principles. Applied NKS is about taking what one finds in the computational universe, and using it as raw material to create models, technology and other things. And the NKS way of thinking is about taking ideas and principles from NKS—like computational irreducibility or the Principle of Computational Equivalence—and using them as a conceptual framework for thinking about things.

And with these categories, here’s how the academic papers published in different types of journals break down:

Academic papers published in different types of journals

So what are all these papers actually about? Let’s start with the largest group: applied NKS. And among these, a striking feature is the development of models for a dizzying array of systems and phenomena. In traditional science, new models are fairly rare. But in just a decade of applied NKS academic literature, there are already hundreds of new models.

Hair patterns in mice. Shapes of human molars. Collective butterfly motion. Evolution of soil thicknesses. Interactions of trading strategies. Clustering of red blood cells in capillaries. Patterns of worm appendages. Shapes of galaxies. Effects of fires on ecosystems. Structure of stromatolites. Patterns of leaf stomata operation. Spatial spread of influenza in hospitals. Pedestrian traffic flow. Skin cancer development. Size distributions of companies. Microscopic origins of friction. And many, many more.

One of the key lessons of NKS is that even when a phenomenon appears complex, there may still be a simple underlying model for it. And to me one of the most interesting features of the applied NKS literature is that over the course of the decade typical successful models have been getting simpler and simpler—presumably as people get more confident in using the methods and ideas of NKS.

Of NKS-based models now in use, the vast majority are still based on cellular automata—the very first simple programs I studied back in the early 1980s. There’s been some use of substitution systems, mobile automata and various graph-based systems. But cellular automata are still definitely the leaders. And beginning with my work on them in the early 1980s, there’s been steady growth in their use for nearly 30 years. (Notably, the number of papers about them firmly overtook the number of papers about Turing machines around 1995.)

papers about cellular automata

It’s notable that the breakdown of cellular automaton papers is distinctly different from the breakdown of all NKS papers:

Cellular automaton papers published identified by journal fields

And it’s a great testament to the importance of simple rules that even among the 256 simplest possible cellular automata, there’ve now been papers written about almost every single one of them. (The top rules, which also happen to be top in the NKS book, are rule 110, rule 30 and rule 90—which respectively show provable computation universality, high-quality randomness generation and additive nested structure capable of mathematical analysis).

papers by cellular automaton rule number
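All three of those top rules are one-liners to reproduce in Mathematica:

    (* 200 steps of rules 30, 90 and 110, each started from a single black cell *)
    ArrayPlot[CellularAutomaton[#, {{1}, 0}, 200]] & /@ {30, 90, 110}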

Many of these papers are applied NKS, using cellular automata as models (rule 184 for traffic flow, rule 90 for catalysis, rule 54 for phase transitions, etc.). But many are also pure NKS, studying the cellular automata for their own sake.

In a sense, each possible cellular automaton rule is a world unto itself. And over the course of the past decade, a whole variety of pockets of literature have developed around specific rules—typically using various computational and mathematical methods to identify and prove features of their behavior (nesting in the boundary of rule 30, blocks in rule 146, logic structures in rule 54, etc.). And in general, it continues to amaze me just how many things are still being discovered even about the very simplest possible cellular automaton rules—that I first studied 30 years ago.

One of my favorite activities when I worked on the NKS book was just to go out into the computational universe, and explore what’s there. In a sense it’s like quintessential natural science—but now concerned with exploring not stars and galaxies or biological flora and fauna, but instead abstract programs. Over the years I’ve developed a whole methodology for exploring the computational universe, and for doing computer experiments. And it’s interesting to see the gradual spread of this methodology—rather obviously visible, for example, in all sorts of papers with pictures that look as if they could have come straight from the NKS book. (Most often, there are arrays of black-and-white rasters and the like; the more elaborate style of algorithmic diagrams in the NKS book is still much rarer—even though recent versions of Mathematica have made them easier to make.)

There’s been an issue, though, in that raw explorations of the computational universe don’t fit terribly well into current patterns of academia and academic publishing. We’ve tried to do what we can with our Complex Systems journal (published since 1986), but ultimately there are new venues and models needed, based more on structured knowledge and less on academic narrative. Soon after the NKS book came out, we did some experiments on developing an “Atlas of Simple Programs”—but we realized that to make this truly useful, we’d need something not so much like an atlas, but more like a computational knowledge engine. And of course, now with Wolfram|Alpha we have just that—and we’re gradually adding lots of NKS knowledge to it:

Rule 30

If one looks at pure NKS work over the past decade, the vast majority has been concerned with types of systems that I already discussed in some form or another in the book. Cellular automata remain the most common, but there are also recursive sequences, substitution systems, network systems, tag systems, and all sorts of others. And there’ve been some significant generalizations, and some new kinds of systems. Cellular automata with non-local rules, with memory, or on networks. Turing machines with skips. Iterated finite automata. Distance transform automata. Planar trinets. Generalized reversal-addition systems. And others.

Much of the systematic work that’s been done on pure NKS has been by “professional scientists” who publish in traditional academic journals. But some of the most interesting and innovative work has actually been done by amateurs (most often people involved in some facet of the computer industry) who have the advantage of not having to follow the constraints of academic publishing—but at least for now have the disadvantage that there is no centralized venue for the contributions.

In pure NKS, one often starts from raw observations of the computational universe, but then moves on to detailed analysis, and to the formulation of increasingly general principles. And in the past 10 years all sorts of analyses have been done. Classification of compressibility of outputs from simple programs. Possible forms of boundaries for cellular automaton growth. Empirical analysis of Turing machine running times. Sequences of possible periods in cellular automata. Symmetries of finite Turing machine state transition diagrams. Universal algebra properties of cellular automata. And lots more.

Needless to say, there are still plenty of things that remain far out of reach—like proving any significant degree of randomness for sequences generated by rule 30. But in 2007, to celebrate the fifth anniversary of the publication of the NKS book, I decided to issue a challenge—and offered a $25,000 prize for determining whether or not a particular simple Turing machine from the book is computation universal. I thought this challenge might remain open for a century—but I was thrilled when after just a few months, it was answered in the affirmative by a young English computer science student.

The Wolfram 2,3 Turing Machine Research Prize

Not only did this establish the identity of the very simplest universal Turing machine—but it also provided an important new piece of evidence for what I consider to be the most powerful principle to emerge so far in pure NKS: the Principle of Computational Equivalence.

In academic work on pure NKS, there’s been a definite strand of attempts to further formalize concepts and principles in the book—and perhaps prove or disprove them—most often using methods from mathematical logic or theoretical computer science. The four classes of cellular automaton behavior. The Principle of Computational Equivalence. Computational irreducibility. Intrinsic randomness. Over the past decade all sorts of results for particular simplified versions or situations have been obtained—but in the absence of major new methods, nothing complete or definitive has emerged.

Beyond pure and applied NKS, another important component of the NKS academic literature is work based on the NKS way of thinking. Quite often NKS has been used in connection with questions about the methodology of science and the foundations of modeling. But there’s also all sorts of use of NKS in thinking about mainstream issues in fields like philosophy, social science, theology, economics, psychology, politics, law and management science.

My impression is that some important things are happening here. It’s taken a long time for the traditional mathematical approach to science to be applied at all to these areas. And in many cases it’s not been terribly helpful. But with the NKS way of thinking there’s a whole new way that ideas from science can enter. Which seems both to clarify many longstanding issues, and to make it possible to discuss all sorts of new ones.

Free will in philosophy. Possible organizational structures in management science. Theoretical controllability in economics. Liability for consequences in law. There are surprisingly definite things that have been said about all these on the basis of NKS ideas.

One of the things that’s happened over the decade since the NKS book was published is the development of countless fields that go by the name “computational ___”. There’s computational philosophy, computational history, computational social science, computational law—and many others.

And perhaps one of the most important uses of NKS and the NKS way of thinking is that it provides a core intellectual foundation for these fields. There’s usually an important level at which one’s dealing mainly with practical computation (often using Mathematica). But when one’s concerned with foundational theoretical principles, NKS very often seems to be the key. And as one traces the expansion of these new fields, one can see rather clearly in the academic literature a corresponding spreading of NKS ideas.

But in the effort to get a global view of the progress of NKS, the academic literature is just one part of the story.

There’s also all sorts of very diverse discussion of NKS ideas in blogs, news commentaries, opinion pieces and the like. Sometimes NKS is used to bolster or attack some political argument or another. Sometimes NKS is invoked just to show that there are new and different ideas to consider. And sometimes NKS is used in ways where it’s hard (at least for me) to tell if it makes sense (relations to Eastern religious traditions, unusual sensory experiences, etc.).

It’s been fun over the past decade to watch NKS gradually make its way into popular culture. Whether it’s in cartoon strips that pithily use some NKS idea. Or in fiction that uses NKS to theme some character. Or in a cameo appearance of the physical NKS book—or an NKS piece of dialog—on a TV show.

There’ve been quite a few science fiction novels where NKS has been central. Sometimes there’s a hero or villain who pursues NKS research or ideas. Sometimes NKS enters in the understanding of what a future governed by computation would be like. And sometimes there’s some object whose operation or purpose can only be understood in NKS terms.

In a rather different direction, another major use of NKS has been in art.

From my very earliest investigations of cellular automata, it’s been clear that they can create rich and aesthetically pleasing visual images. But the publication of the NKS book dramatically accelerated the use of systems like cellular automata for artistic purposes. And in the past decade I’ve seen cellular automaton patterns artistically rendered in paint, knitting, mosaic, sticks, cake decoration, punched holes, wood blocks, smoke, water valves and no doubt all sorts of other media that I can’t immediately even imagine.

Cellular automaton patterns artistically rendered in multiple forms

In the interactive domain, there are countless websites and apps that run cellular automata and other NKS systems. Sometimes what’s done is a fairly straightforward rendering of the underlying system—not unlike what’s in the NKS book. And sometimes there is extensive artistic interpretation done. Sometimes what’s built is used essentially for hobbyist NKS explorations, sometimes for art, and sometimes for pure entertainment.

In games—and movies—NKS systems have been extensively used both to produce detailed effects such as textures, and to handle for example collective behavior of large numbers of similar entities. NKS systems are also used to define overall rules for games or for characters in games.

NKS is not limited to visual form. And indeed NKS systems have been used extensively to generate audio—with probably the most ambitious experiment in this direction being our 2005 WolframTones project.

WolframTones

Of all responses to the NKS book, one of the consistently strongest has been from the architecture community. Many times I have heard how much architects value both the physical book and its illustrations, as well as the ideas about creation of form that it contains.

I don’t know if there’s any large actual building yet constructed from the simple rules of an NKS system. But I know quite a few have been planned. And there’s also been significant work done on landscape architecture and urban planning—not to mention interior design—with NKS systems.

There’ve been many important applications of NKS in the sciences, humanities and arts over the past decade. But if there’s one area where the effect of NKS has been emerging as most significant, it’s technology. Just as the computational universe gives us an inexhaustible supply of new models, so also it gives us an inexhaustible supply of new mechanisms that we can harness for technology. And indeed in our own technology development over the past decade, we have made increasing use of NKS.

A typical pattern is that we identify some task, and then we search the computational universe for a simple program to carry out the task. Often this kind of mining of the computational universe seems like magic—and the programs we discover seem much cleverer than anything we would have ever come up with by our usual engineering methods of step-by-step construction.
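
As a toy illustration of what such a search can look like in Mathematica (a minimal sketch, with a deliberately simple criterion standing in for a real engineering objective), one can enumerate all 256 elementary cellular automata and keep those whose center columns come out statistically balanced between black and white:

    (* keep rules whose center column is roughly half black over 300 steps *)
    balancedQ[r_] := Abs[Mean[Flatten[CellularAutomaton[r, {{1}, 0}, {300, {{0}}}]]] - 1/2] < 1/20;
    candidates = Select[Range[0, 255], balancedQ]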

Years ago we started using rule 30 as the pseudorandom generator in Mathematica—and more recently we’ve done large-scale search to find other cellular automata that are slightly more efficient for this task. Over the years we’ve also mined the computational universe to find efficient hash codes, function evaluation algorithms, image filters, linguistics algorithms, visual layout methods and much, much more. Sometimes there’s something known from pure NKS that immediately suggests some particular rule or type of system to use; more often we need to do a systematic search from scratch.
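
The center-column idea itself is easy to demonstrate (this is just a sketch of the underlying principle, not the actual internal implementation in Mathematica):

    (* bits from the center column of rule 30, grown from a single black cell *)
    bits = Flatten[CellularAutomaton[30, {{1}, 0}, {1000, {{0}}}]];
    (* pack successive 16-bit groups into pseudorandom integers *)
    FromDigits[#, 2] & /@ Partition[bits, 16]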

It’s a little hard to track how NKS has been used in technology in the world at large. There are a handful of patents that refer to NKS, and a number of organizations have systematically sought my advice about using NKS—but most of what I’ve heard about technology uses of NKS has come from anecdotes and chance encounters.

Sometimes it’s at least somewhat clear on the surface, because there’s some visible NKS-like pattern to be seen—say in window shade arrangements for buildings, antenna designs, contact lens patterns, heat conduction networks, and so on.

But more often it’s a case of “NKS inside”, where there’s some internal algorithm that’s been found by exploring the computational universe. By now I know of a large number of examples. Modular robots. Mesh networks. Cryptography. Image processing. Network routing. Computer security. Concurrency protocols. Computer graphics. Algorithmic texture generation. Probabilistic processor designs. Games. Traffic light control. Voting schemes. Financial trading systems. Precursors to algorithmic drugs. And many more.

It’s not like the academic literature, where (at least in theory) there are citations to be followed. But my strong impression is that the use of NKS in practical technology is starting to dwarf its use in basic research. In part this is just a sign of healthy development in the direction of applications.

But in part it is, I think, also a reflection of limitations in the current structure of academia and basic research. For university departments and research funding programs are typically organized according to disciplines that are many decades old—and are ill equipped to adapt to new directions like NKS that do not fit into their existing structures.

Still, over the past decade, there have sprung up around the world a good number of groups that are successfully carrying out basic NKS research. The story in each case is different. Particular leadership; special reasons for institutional or government support; particular ways to survive “interstitially” in an institution; and so on. But gradually more and more of these pockets are developing, and prospering.

When the NKS book appeared, I visited many universities and research labs—and talked to many university presidents, lab directors, and the like. A common response was excitement not only about the subject matter of NKS, but also about the social phenomenon that at talks and lunches about NKS, people who might have been in different parts of an institution for decades were actually meeting—and interacting—in ways they’d never done before.

For a while I thought it might make sense to use this enthusiasm to help launch strong NKS initiatives at particular existing institutions. But after some investigation, it became clear that this was not going to be any kind of quick process. It wasn’t like at our company, where a significant new initiative can be launched in days. This was something that was going to take years—if not decades—to achieve.

And actually, I pretty much knew what to expect. Because I’d seen something very similar before. Back in the early 1980s, after I made my first discoveries about cellular automata, I realized that studying complex systems with simple rules was going to be an important area. And so I set about developing and promoting what I called “complex systems research” (I avoided the eventually-more-common term “complexity theory” out of deference to the existing area of theoretical computer science by that name).

I soon started the first research center in the field (the Center for Complex Systems Research at the University of Illinois), and the first journal (Complex Systems). I encouraged the then-embryonic Santa Fe Institute to get into the field (which they did), and did my best to expand support for the field. But things went far too slowly for me—and in 1986 I decided that a better personal strategy would be to build the best possible tools I could (Mathematica), get the best possible personal environment (Wolfram Research), and then have the fun of doing the research myself. And indeed this is how I came to embark on A New Kind of Science, and ultimately to take things in a rather different direction.

So what happened to complexity? Here’s a plot of the number of “complexity institutes” in the world by founding date (at least the ones we could find):

known complexity institutes

And here’s their current geographic distribution (some excellent ones are doubtless missing!):

Geographic distribution of “complexity institutes” in the world

(Fully half of all complexity institutes we identified didn’t tell us their founding dates. The institutes in operation before 1986 were originally rather different from what I called complex systems research, but have evolved to be much closer.)

So what does this tell us? The main thing as far as I’m concerned is that the development of institutions is slow and inexorable. Until we collected the data I had no idea there were now so many complexity institutes in the world. But it took close to three decades to get to this point. I think NKS is off to a quicker start. But it’s not clear what one does to accelerate it, and it’s still inevitably going to take a long time.

I might say that I think the intellectual development of NKS is going much better than it ever did for complexity. Because for NKS there’s a core of pure NKS—as well as the NKS way of thinking—that’s successfully being studied, and building up a larger and larger body of definite formal knowledge. In complexity, however, there was an almost immediate fragmentation into lots of different—and often somewhat questionable—application areas. Insofar as these applications followed traditional disciplinary lines, it was often easier to get institutional support for them. But with almost no emphasis on any kind of core intellectual structure, it’s been difficult for coherent progress to be made.

What’s exciting to see with NKS is that in addition to strong applications—especially in technology—there is serious investment in general abstract core development. Which means that independent of success in any particular application area, there’s a whole intellectual structure that’s growing. And such structures—like for example pure mathematics—have tended to have implications of great breadth, historically spanning millennia.

So how should people find out about NKS? Well, I put all the effort I did into writing A New Kind of Science to make that as easy as possible for as broad a range of people as possible. And I think that for the most part that worked out very well. Indeed, it’s been remarkable over the past decade how many people I’ve run into—often in unexpected places—who seem to have read and absorbed the contents of the NKS book.

And to me what’s been most interesting is how many different walks of life those people come from. Occasionally they’re academics, with specific interests. More often they’re other kinds of intellectually oriented people. Sometimes they’re involved with computers; sometimes they’re not. Sometimes they’re highly educated; sometimes they’re not. Sometimes they’re young; sometimes they’re old. And occasionally they turn out to be famous—often for things that at first seem to have no connection to interest in NKS.

I suppose for me one of the most satisfying things about the spread of NKS is seeing people get so much pleasure out of taking on the serious pursuit of NKS. I have a theory that for almost everyone there is a certain direction that’s really their ideal niche. Too often people go through their lives without finding it. Or they live at a time in history when it simply does not exist. But when something like NKS comes along, there’s a certain set of people for whom this is their thing: this is the direction that really fits them. I’m of course an example, and I feel very fortunate to have found NKS. But what’s been wonderful over the past decade is to see all sorts of other people—at different stages in their lives—having that experience too.

There’s much to do in spreading knowledge about NKS. I started the job with the NKS book. But by now NKS has found its way into plenty of textbooks and popular books. Our Demonstrations Project has lots of interactive demonstrations of NKS concepts. And there are other apps and websites too.

For me a major effort in NKS education has been our Summer School, held every year since 2003. We’ve had a diverse collection of outstanding students, and it’s been invigorating each year to see the variety of NKS projects that they’ve been able to pursue. But I suppose if there’s been one discovery for me from the Summer School, it’s been what an important general educational foundation NKS provides.

One can study NKS for its own sake, just as one can study math or physics for their own sake. But like math or physics, one can also study NKS as a way to develop general patterns of thinking to apply all over the place. This works at the level of education for business executives—and it also works at much lower educational levels.

When I first saw people suggesting that young kids work out cellular automaton evolutions by hand on graph paper, I thought it was a little nutty. But then I realized that it’s a great exercise in precision work, with rather satisfying results, that also teaches a kind of “pre-computer-science” idea of what algorithms are. Oh, and it can also be quite aesthetic, and even has direct connections with nature (yes, teachers ask us where to get patterned mollusc shells).
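
The graph-paper procedure also makes a nice first program. Here is a minimal Mathematica sketch that spells out exactly the lookup the kids do by hand: read off each three-cell neighborhood, then look up the new cell color in the rule:

    (* one step: pad the row with white cells, then apply the rule to each neighborhood *)
    step[rule_, row_] := rule[[8 - FromDigits[#, 2]]] & /@ Partition[ArrayPad[row, 1], 3, 1];
    rule30 = IntegerDigits[30, 2, 8];  (* outcomes for neighborhoods 111 down to 000 *)
    NestList[step[rule30, #] &, ArrayPad[{1}, 5], 5]  (* 5 steps on an 11-cell grid *)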

There have now been many courses at graduate, undergraduate and high school levels that use the NKS book. And while I don’t know everything that’s happened, I know that there have at least been experiments with NKS at middle school and elementary school levels.

For me, one of the interesting things is that it’s so easy for anyone to discover something original in NKS. At the beginning of our Summer School we always ask each student to find an “interesting” cellular automaton—and they inevitably come back with a rule that has never been seen before. The computational universe is so vast—and still so unexplored—that anyone can find their own part of it. And then it’s a question of having a good methodology and being systematic to be able to discover something of real value.

When I look at the NKS book now, I’m pleased how well it has withstood the past 10 years. New things have been discovered, but they do not supersede what’s in the book. And even with all the scrutiny and detailed study the book has received, not a single significant error has surfaced. And the pictures are direct and abstract enough to be in a sense timeless (much like a Platonic solid from ancient Egypt looks the same as one today). Of course, while the paper version of the book is elegant and classic—and seems to me a little ceremonial—the book is now much more read on the web and the iPad.

But what of the detailed content of the book? There are certainly plenty of academic papers that cite specific pages of the book. But as I look through the book, I cannot help but be struck by how much more there is to “mine” from it. In writing the book—with Mathematica at my side—it felt as if I had developed a kind of industrial-scale way of making discoveries. Thousands of them. That I discussed in the main part of the book, or put in tiny print and little diagrams into the notes at the back.

It’s been interesting to see what’s been absorbed, at what rate. Different fields seem to have different characteristic times. Art, for example, is very fast. Within months after the NKS book came out, there was significant activity among artists based on NKS concepts. No doubt because in art there’s always excitement about the new.

The same was true for example in finance and trading. Where again there is a great premium for the new.

But in other fields, things were slower. The younger and more entrepreneurial the field, the faster things seemed to go. And the less set the conceptual framework for the field, the deeper the early absorption would be. In fields like mathematics and physics where there’s a tradition of precise results, some basic technical understanding came quickly. But in these old and mature fields, the rate of real absorption seemed to be incredibly slow. And indeed, if I look at what I consider major gaps in the follow-up to the NKS book, they are centered around mathematics and physics—with a notable area being the work on the fundamental theory of physics in the book.

In a sense, ten years is a short time for the kind of development that I believe NKS represents. Indeed, in my own life so far, I have found that ten years is about the minimum time it takes me to really begin to come to terms with new frameworks for thinking about things. And certainly in the general history of science and of ideas, ten years is no time at all. Indeed, ten years after the publication of Newton’s Principia, for example, almost none of the people who would make the next major advances had even been born yet.

And ten years after the publication of A New Kind of Science, I am personally quite satisfied with what I can see of the development of the field. Certainly there is far to go, but a solid start has been made, many important steps have been taken, and the inexorable progress of NKS seems assured.

Living a Paradigm Shift: Looking Back on Reactions to A New Kind of Science

(This is the second of a series of posts related to next week’s tenth anniversary of A New Kind of Science. The previous post covered developments since the book was published. )

“You’re destroying the heritage of mathematics back to ancient Greek times!” With great emotion, so said a distinguished mathematical physicist to me just after A New Kind of Science was published ten years ago. I explained that I didn’t write the book to destroy anything, and that actually I’d spent all those years working hard to add what I hoped was an important new chapter to human knowledge. And, by the way—as one might guess from the existence of Mathematica—I personally happen to be quite a fan of the tradition of mathematics.

He went on, though, explaining that surely the main points of the book must be wrong. And if they weren’t wrong, they must have been done before. The conversation went back and forth. I had known this person for years, and the depth of his emotion surprised me. After all, I was the one who had just spent a decade on the book. Why was he the one who was so worked up about it?

And then I realized: this is what a paradigm shift sounds like—up close and personal.

I had been a devoted student of the history of science for many years, so I thought I knew the patterns. But it was different having it all unfold right around me.

I had been building up the science in the book for the better part of 20 years. And I had been amazed—almost shocked—at many of the things I’d discovered. And I knew that communicating it all to the world wouldn’t be easy.

In the early years, I’d just done what scientists typically do, publishing papers in academic journals and giving talks at academic conferences. And that had gone very well. But after I built Mathematica, I started being able to discover things faster and faster. I had a great time. And pretty soon I had material for many tens—if not hundreds—of academic papers. And what’s more, the things I was discovering were starting to fit together, and give me a whole new way of thinking.

What was I going to do with all this? I suppose I could have just kept it all to myself. After all, by that time I was the CEO of a successful company, and certainly didn’t make my living by publishing research. But I thought what I was doing was important, and I really liked the idea of giving other people the opportunity to share the enjoyment of the things I was discovering. So I had to come up with some way to communicate it all. Publishing lots of piecemeal academic papers in the journals of dozens of fields wasn’t going to work. And instead it seemed like the best path was just to figure out as much as I could, and then present it to the world all together in a coherent way. Which in practice meant I had to write a book.

Back in 1991 I thought it might take me a year, maybe two, to do this. But I kept on discovering more and more. And in the end it took nearly 11 years before I finally finished what I had planned for A New Kind of Science.

But whom was the book supposed to be for? Given all the effort I had put in, I figured that I should make it as widely accessible as possible. I’m sure I could have invented some elaborate technical formalism to describe things. But instead I set myself the goal of explaining what I’d discovered using just plain language and pictures. And as it turned out, countless times doing this helped me clarify my own thinking. But it also made it conceivable for immensely more people to be able to read and understand the book.

I knew full well that all this was very different from the usual pattern in science. Most of the time, front-line research gets described first in academic papers written for experts. And by the time it gets into books—and especially broadly accessible ones—it’s not new, and it’s usually been very watered down. But in my case, many years of front-line discoveries would first get described in a broadly accessible book.

Even in the preface I wrote for the book, I expressed concern about how specialist scientists would react to this. But my personal decision was that it was worth it. And when the book came out, it indeed for the most part worked out spectacularly. Most important as far as I was concerned was that a huge spectrum of people were able to read and understand the book. And in fact lots of people specifically thanked me for writing the book so it was accessible to them.

Many specialist scientists were also highly enthusiastic about the book. But much as I had expected, there was a certain component who just assumed that anything presented in a bestselling book couldn’t really be important new science—and pretty much stopped there.

And then there were others for whom the book just seemed irrelevant; they were happy with their ways of doing and thinking about science, and they weren’t interested—at least at that time—in any particular injection of new ideas.

But what about people whose work was in some way or another directly connected to issues discussed in the book? I have to say that by and large I expected a very positive reaction from such people. After all, I had put all this meticulous work into studying things that they were interested in—and I believed I had come up with some exciting results. And what’s more, I personally knew many of these people—and lots of them had benefited greatly from my efforts to develop the field of complex systems research some 15 or so years earlier.

So discussions like the one I described at the beginning of this post at first came as something of a shock. Of course, I had spent quite a few years as an academic, so I was well aware of the petty bickering and backstabbing endemic to that profession. But this was something different: this was people who were somehow deeply upset by what I was doing.

Some of the more sophisticated and forthright of them were pretty explicit with me, at least in person. Typically there was a surface reason for their reaction, and a deeper reason. Sometimes the surface reason related to content, sometimes to form. Those who discussed content fell into two main groups. The first—particularly populated by physicists—said that their immediate reason for being upset was basically that “If what you’re doing is right, we’ve spent our whole careers barking up the wrong tree”. The second group—particularly populated by people who’d studied areas related to complexity—basically said: “If people buy into what you’ve done, it’ll overshadow everything we’ve done”.

To me what was perhaps most striking was that these reactions were often coming from some of the best-established and most senior people in their fields. I suppose at some level it was flattering, but I had certainly not expected this kind of insecurity—not least because I thought it was completely unfounded. Yes, I believed new approaches were needed—that’s why I’d spent so many years developing them. But I saw what I had done as complementing and adding to what had been done before, not replacing or overturning it.

And then there was the matter of form. “You’ve done something that’s academic-like, but you haven’t played by academic rules.” It was true: I wasn’t an academic and I wasn’t operating according to the constraints of academia; I was just trying to invent the best possible ways to do things, given my resources and the discoveries I was making.

A typical issue that came up was how the book was vetted or checked. In academia, there’s the idea that “peer review” is the ultimate method of checking anything. And perhaps in a world where everyone has infinite time, and nobody operates according to their own self-interest, this might be true. But in reality, peer review is fraught with error, often quite corrupt, and even in the best case strongly biased toward avoiding new ideas and maintaining the status quo. And for a piece of work as large, broad and complex as A New Kind of Science, even the basic mechanics of it seemed completely impractical.

So what did I do instead? It was a big exercise in perfectionism. First, we built a large system for automated testing, modeled on what we’d developed over the course of many years for Mathematica. And then we developed a process for getting experts in different areas to look at every page of the book, checking as far as possible every detail. So how did it work out? Impressively well. For even now, 10 years later, after every page of the book has been read and scrutinized by huge numbers of people, and every computational result has been reproduced many times, in all the 1280 pages of the book no errors much beyond simple typos have come to light.

One of the many challenges with the book was the practicality of actually printing and publishing it. Early on, I had hoped that one of the large publishers who wanted to publish the book would be able to handle it. But after a while it became clear that their production methods and business models could not realistically handle the level of visual quality that the content of the book required. And so, somewhat reluctantly at first, I decided to have Wolfram Media do the publishing instead.

This certainly allowed the book to be printed at higher quality, and to be sold more cheaply. But the setup definitely seemed not to please some academics—particularly, I suspect, because it made clear that the book was simply beyond the reach of any academic network, however powerful that network might be in the academic world. Though a couple of times I did hear things like “I’m so shocked about [some aspect of your book] that I’m going to campaign my university not to use Mathematica”.

If one looks at a standard academic paper, one of its prominent elements is always a list of references—in principle a list of authorities for statements in the paper. But A New Kind of Science has no such list of references. And to some academics this seemed absolutely shocking. What was I thinking? I always consider history important—both for giving credit and for letting one better understand the context of ideas. And when I wrote A New Kind of Science, I resolved that rather than just throwing in disembodied references, I would actually do the work of trying to unravel and explain the detailed histories of things.

And the result was that of the nearly 300,000 words of notes at the back of the book, a significant fraction are about history. I did countless hours of (often fascinating) primary interviews and went through endless archives—and in the end was rather proud of the level of historical scholarship I managed to achieve. And when it came to traditional references I figured that rather than using yet more printed pages, I should just include in the notes appropriate names and keywords, from which anyone—even with the state of web search in 2002—could readily find whatever primary literature they wanted, at greater depth and more conveniently than from lists of journal page numbers.

When A New Kind of Science came out, I kept on hearing complaints that it didn’t refer to this or that person or piece of work. And each time I would check. And to my frustration, in almost every case it was right there in the book—with a whole extended historical story. And it wasn’t as if when people actually read it, they disagreed with the history I’d written. Indeed, to the contrary, many times people told me they were impressed at how accurate and balanced my account was—and often that they’d learned new things even about pieces of history in which they were personally involved.

So why were people complaining? I think it was somehow just disorienting for academics not to be able to glance down a definite “references section”—and see papers they’d authored or otherwise knew. But I’m pretty sure it was more an emotional than a functional issue. And as one indication, after the book came out we did the experiment of putting on the web—in standard academic reference format—the list of the 2600 or so books that I’d used in writing A New Kind of Science. And from our web statistics we know that vastly fewer people used this than for example the online version of even one chapter’s worth of historical notes. (Even so, as a matter of completeness, I’m hoping one day to link all my archives of papers to the online book.)

I suppose another feature of the book that did not endear it to some academics was the very intensity of positive reaction that accompanied its release. Within days there were hundreds of articles in the media describing the ideas in the book—with journalists often doing an impressive job of understanding what the book had to say. And then, mostly slightly later, reviews started to appear. Some were detailed and well reasoned; others were quite rushed, and often seemed mainly to be emotional responses—probably more based on reading earlier reviews than on the reading of the actual 1280-page book itself. And after such a positive initial wave of media attention, later (often “catch up”) coverage inevitably tended to swing to the more negative.

Reviews of A New Kind of Science

When the book came out, I had all the reviews we could find diligently archived. And I always intended at some point to systematically read them. But somehow a decade has gone by, and I have not done so. And as I write this post, I have on my desk a daunting pile of printed copies of reviews, as thick as the book itself. But back when they were archived, for reasons I don’t now know, each review was at least put into a “star-rating” category, which we can now use to make a pie chart. And while I’m not sure just how much these statistics really mean, it is perhaps interesting that positive—or at least neutral—reviews overall significantly outweighed negative ones.

So what about the negative reviews? There are certainly some colorful quotes in them. “Why has this undoubtedly brilliant, worthily successful man written such a silly book?… I think it… likely that the book will be forgotten in a few months.” “There’s a tradition of scientists approaching senility to come up with grand, improbable theories. Wolfram is unusual in that he’s doing this in his 40s.” “Is this stuff really that important? Well… maybe. Frankly, I doubt it.” “After looking at hundreds of Wolfram’s pictures, I felt like the coal miner in one of the comic sketches in Beyond the Fringe, who finds the conversation down in the mines unsatisfying: ‘It’s always just “Hallo, ‘ere’s a lump of coal.”’” “With extreme hubris, Wolfram has titled his new book on cellular automata ‘A New Kind of Science’. But it’s not new. And it’s not science.” “It was not the first time the names Wolfram and Newton have been mentioned in the same breath, and I suppose it might be taken as further evidence of an ego bursting all bounds.” And, perhaps my favorite, a whole review simply titled “A Rare Blend of Monster Raving Egomania and Utter Batshit Insanity”.

Realistically, much of the space in these reviews was not devoted to discussing the actual content of the book. High on the list of issues they discussed was that the book had not gone through an academic peer review process, and so could “not be considered an academic work” (not that it was meant to be). Then there were the complaints about the absence of explicit lists of references—often rather misleadingly with no mention of all the detailed historical notes, or perhaps a grudging comment that they were in too small a font.

Another common complaint was that the book was somehow just too grandiose. And for sure, any book with a title like “A New Kind of Science” runs the risk of being characterized that way. To be clear, I believed—and very much still believe—that what’s in A New Kind of Science is very important. In presenting it, though, I suppose I could have somehow tried to hide this. But I was fairly sure that doing so would have a bad effect on peoples’ ability to understand what was in the book.

The issue is quite familiar to those of us who have written lots of documentation for computer systems: if you have big ideas to communicate, you have to prime people for them—or they inevitably get confused. Because if people think something is a small idea, they’ll try to understand it by straightforwardly extending what they already know. And when that doesn’t work, they’ll just be confused. On the other hand, if you communicate up front that something is big and important, then people will make the effort to understand it on its own terms—and will much more readily be able to place and absorb it. And so—well aware of the potential for being accused of grandiosity—I made the decision that it was better for the science if I was explicit about what I thought was important, and how important I thought it was.

Looking through reviews, there are some other common themes. One is that A New Kind of Science is a book about cellular automata—or worse, about the idea (not in fact suggested in the book at all) that our whole universe is a giant cellular automaton. For sure, cellular automata are great, visually strong, examples for lots of phenomena I discuss. But after about page 50 (out of 1280), cellular automata no longer take center stage—and notably are not the type of system I discuss in the book as possible models for fundamental physics.

Another theme in some reviews is that the ideas in the book “do not lead to testable predictions”. Of course, just as with an area like pure mathematics, the abstract study of the computational universe that forms the core of the book is not something which in and of itself would be expected to have testable predictions. Rather, it is when the methods derived from this are applied to systems in nature and elsewhere that predictions can be made. And indeed there are quite a few of these in the book (for example about repeatability of apparent randomness)—and many more have emerged and successfully been tested in work that’s been done since the book appeared.

Interestingly enough, the book actually also makes abstract predictions—particularly based on the Principle of Computational Equivalence. And one very important such prediction—that a particular simple Turing machine would be computation universal—was verified in 2007.
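
That machine, the 2-state, 3-color Turing machine whose universality was proved by Alex Smith, can be run directly in Mathematica (a sketch, using what I believe is its number, 596440, in the standard Wolfram numbering):

    RulePlot[TuringMachine[{596440, 2, 3}]]  (* the rule itself *)
    (* run it for 100 steps from a blank tape, with the head in state 1 *)
    evolution = TuringMachine[{596440, 2, 3}, {1, {{}, 0}}, 100];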

There are reviews by people in specific fields—notably mathematics and physics—that in effect complain that the book does not follow the methodology of their field, which is of course why the book is titled “A New Kind of Science”. There are reviews by various academics with varied “it’s been done before” claims. And there are a few reviews with specific technical complaints, about definitions or about phenomena like the emergence of quantum effects from essentially deterministic systems. Sometimes the issues brought up are interesting. But so far as I know not a single review brought up any specific relevant factual issue that wasn’t in some way already addressed in the book.

And reading through negative reviews, the single most striking thing to me is how shrill and emotional most of them are. Clearly there’s more going on than just what’s being said. And this is where the paradigm shift phenomenon comes in. Most people get used to doing science (or other things) in some particular way. And there’s a natural tendency to want to just go on doing things the same way. And there’s no issue with that for people whose subject matter is sufficiently far away—and who can successfully say, “I just don’t care about your new kind of science”. But for people whose subject matter is closer, that doesn’t work. And that’s when the knives really come out.

I have to say that to me (as I discussed in the previous post) the progress of NKS seems quite inexorable—and indeed, a decade after the publication of the book, it seems to be well underway. But I think some of the reviewers of A New Kind of Science convinced themselves that if what they wrote was negative enough they could derail things, and maybe allow their old directions and paradigms to continue unperturbed. And perhaps the mathematical physicist mentioned at the beginning of this post expressed their attitude most clearly when he said in our conversation: “You’re one of the most brilliant people I know… but you should keep out of science”.

There’s lots of analysis that could be done of the dynamics of opinions about A New Kind of Science. In 2002, there were fewer venues than today for public comments to be made. But I suspect that enough existed that it would be possible to piece together much of what happened. And I think it makes a fascinating study in the history of science.

I suppose I am myself by nature a positive person. And no doubt that is a necessary trait if one is going to do the kinds of large projects to which I have devoted most of my life. I am also at this point someone who is much more interested in just doing things than in other people’s assessments of what I do. No doubt others would find attacks on as important a personal project as A New Kind of Science dispiriting. But I have to say that first and foremost my reaction was one of scientific interest. Having studied so much history about paradigm shifts, I found it fascinating to be right in the middle of one myself.

I certainly wondered what one could predict from the dynamics of what was going on. And here history has some interesting lessons. For it suggests that perhaps the single best predictor of good long-term outcomes from potential paradigm shifts is how emotional people get about them at the beginning. So for NKS, all that early turbulence in the end just helps fuel my optimism for its long-term importance and success. And a decade out, not least with everything my previous post discussed, things indeed seem to be well on their way.

Looking to the Future of A New Kind of Science

(This is the third in a series of posts about A New Kind of Science. Previous posts have covered the original reaction to the book and what’s happened since it was published.)

Today ten years have passed since A New Kind of Science (“the NKS book”) was published. But in many ways the development that started with the book is still only just beginning. And over the next several decades I think its effects will inexorably become ever more obvious and important.

Indeed, even at an everyday level I expect that in time there will be all sorts of visible reminders of NKS all around us. Today we are continually exposed to technology and engineering that is directly descended from the development of the mathematical approach to science that began in earnest three centuries ago. Sometime hence I believe a large portion of our technology will instead come from NKS ideas. It will not be created incrementally from components whose behavior we can analyze with traditional mathematics and related methods. Rather it will in effect be “mined” by searching the abstract computational universe of possible simple programs.

And even at a visual level this will have obvious consequences. For today’s technological systems tend to be full of simple geometrical shapes (like beams and boxes) and simple patterns of behavior that we can readily understand and analyze. But when our technology comes from NKS and from mining the computational universe there will not be such obvious simplicity. Instead, even though the underlying rules will often be quite simple, the overall behavior that we see will often be in a sense irreducibly complex.

So as one small indication of what is to come—and as part of celebrating the first decade of A New Kind of Science—starting today, when Wolfram|Alpha is computing, it will no longer display a simple rotating geometric shape, but will instead run a simple program (currently, a 2D cellular automaton) from the computational universe found by searching for a system with the right kind of visually engaging behavior.

What is the fundamental theory of physics?
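
The specific rule isn’t given here, but as a stand-in, here is how one runs a 2D cellular automaton of this general kind in Mathematica (in this case the familiar outer totalistic rule known as the Game of Life, started from random initial conditions):

    life = {224, {2, {{2, 2, 2}, {2, 1, 2}, {2, 2, 2}}}, {1, 1}};
    frames = CellularAutomaton[life, RandomInteger[1, {60, 60}], 30];
    ListAnimate[ArrayPlot /@ frames]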

This doesn’t look like the typical output of an engineering design process. There’s something much more “organic” and “natural” about it. And in a sense this is a direct example of what launched my work on A New Kind of Science three decades ago. The traditional mathematical approach to science has had great success in letting us understand systems in nature and elsewhere whose behavior shows a certain regularity and simplicity. But I was interested in finding ways to model the many kinds of systems that we see throughout the natural world whose behavior is much more complex.

And my key realization was that the computational universe of simple programs (such as cellular automata) provides an immensely rich source for such modeling. Traditional intuition would have led us to think that simple programs would always somehow have simple behavior. But my first crucial discovery was that this is not the case, and that in fact even remarkably simple programs can produce extremely complex behavior—that reproduces all sorts of phenomena we see in nature.

Rule 1635

And it was from this beginning—over the course of nearly 20 years—that I developed the ideas and results in A New Kind of Science. The book focused on studying the abstract science of the computational universe—its phenomena and principles—and showing how this helps us make progress on a whole variety of problems in science. But from the foundations laid down in the book much else can be built—not least a new kind of technology.

This is already off to a good start, and over the next decade or two I expect dramatic progress in the application of NKS to all sorts of technology. In a typical case, one will start from some objective one wants to achieve. Then, either through knowledge of the basic science of the computational universe, or by some kind of explicit search, one will find a system that achieves this objective—often in ways no human would ever imagine or come up with. We have done this countless times over the years for algorithms used in Mathematica and Wolfram|Alpha. But the same approach applies not just to programs implemented in software, but also to all kinds of other structures and processes.

Today our technological world is full of periodic patterns and other simple forms. But rarely will these ultimately be the best ways to achieve the objectives for which they are intended. And with NKS, by mining the computational universe, we have access to a much broader set of possibilities—which to us will typically look much more complex and perhaps random.

How does this relate to the kinds of patterns and forms that we see in nature? One of the discoveries of NKS is that nature samples a broader swath of the computational universe than we reach with typical methods of mathematics or engineering. But it too is limited, whether because natural selection tends to favor incremental change, or because some physical process just follows one particular rule. But when we create technology, we are free to sample the whole computational universe—so in a sense we can greatly generalize the mechanisms that nature uses.

Some of the consequences of this will be readily visible in the actual forms of technological objects we use. But many more will involve internal structures and processes. And here we will often see the consequences of a central discovery of NKS: the Principle of Computational Equivalence—which implies that even when the underlying rules or components of a system are simple, the behavior of the system can correspond to a computation that is essentially as sophisticated as anything. And one thing this means is that a huge range of systems are capable in effect not just of acting in one particular way, but of being programmed to act in almost arbitrary ways.

Today most mechanical systems we have are built for quite specific purposes. But in the future I have no doubt that with NKS approaches, it will for instance become common to see arbitrarily “programmable” mechanical systems. One example I expect will be modular robots consisting of large numbers of fairly simple and probably identical elements, in which almost any mechanical action can be achieved by an appropriate sequence of small-scale motions, typically combined in ways that were found by mining the computational universe.

Similar things will happen at a molecular level too. For example, today we tend to have bulk materials that are either perfect periodic crystals, or have atoms arranged in a random amorphous way. NKS implies that there can also be “computational materials” that are grown by simple underlying rules, but which end up with much more elaborate patterns of atoms—with all sorts of bizarre and potentially extremely useful properties.

When it comes to computing, we might think that to have a system at a molecular scale act as a computer we would need to find microscopic analogs of all the usual elements that exist in today’s electronic computers. But what NKS shows us is that in fact there can be much simpler elements—more readily achievable with molecules—that nevertheless support computation, and for which the effort of compiling from traditional forms of computation is not too great.

An important application of these kinds of ideas is in medicine. Biology is essentially the only existing example where something akin to molecular-scale computation already occurs. But existing drugs tend to operate only in very simple ways, for example just binding to a fixed molecular target. With NKS methods, though, one can expect instead to create “algorithmic drugs” that in effect do a computation to determine how they should act—and can also be programmable for different cases.

NKS will also no doubt be important in figuring out how to set up synthetic biological organisms. Many processes in existing organisms are probably best understood in terms of simple programs and NKS ideas. And when it comes to creating new biological mechanisms, NKS methods are the obvious way to take underlying molecular biology and find schemes for building sophisticated functionality on the basis of it.

Biology gives us ways to create particular kinds of molecular structures, like proteins. But I suspect that with NKS methods it will finally be possible to build an essentially universal constructor, that can in effect be programmed to make an almost arbitrary structure out of atoms. The form of this universal constructor will no doubt be found by searching the computational universe—and its operation will likely be nothing close to anything one would recognize from traditional engineering practice.

An important feature of NKS methods is that they dramatically change the economics of invention and creativity. In the past, to create or invent something new and original has always required explicit human effort. But now the computational universe in effect gives us an inexhaustible supply of new, original, material. And one consequence of this is that it makes all sorts of mass customization broadly feasible.

There are many immediate examples of this in art. WolframTones did it for simple musical pieces. One can also do it for all sorts of visual patterns—perhaps ever changing, and selected from the computational universe and then grown to fit into particular spatial or other constraints. And then there is architecture, where one can expect to discover in the computational universe new forms that can be used to create all sorts of structures. And indeed in the future I would not be surprised if at first the most visually obvious everyday examples of NKS were forms of things like buildings, their dynamics, decoration and structure.

Mass production and the legacy of the industrial revolution have led to a certain obvious orderliness to our world today—with many copies of identical products, precisely repeating processes, and so on. And while this is a convenient way to set things up if one must be guided by traditional mathematics and the like, NKS suggests that things could be much richer. Instead of just carrying out some processes in a precisely repeating way, one computes what to do in each case. And when many such pieces of computation are put together, the behavior of the system as a whole can be highly complex. And finding the correct rules for each element—to achieve some set of overall objectives—is no doubt best done by studying and searching the computational universe of possibilities.

Viewed from the outside, some of the best evidence for the presence of our civilization on Earth comes from the regularities that we have created (straight roads, things happening at definite times, radio carrier signals, satellite orbits, and so on). But in the future, with the help of NKS methods, more and more of these regularities will be optimized out. Vehicles will move in optimized patterns, radio signals will be transferred in complicated sequences of local hops… and even though the underlying rules may be simple, the actual behavior that is seen will look highly complex—and much more like all sorts of systems in physics and elsewhere that we already see in nature.

There are other—more abstract—situations where computation and NKS ideas will no doubt become increasingly important. One example is in commerce. Already there is an increasing trend toward algorithmic pricing. Increasingly commercial terms and contracts of all kinds will be stated in computational terms. And then—a little like a market of algorithmic traders—there will be what amounts to an NKS issue of what the overall consequences of many separate transactions will be. And again, finding the appropriate rules for these underlying transactions will involve understanding and searching the computational universe—and will presumably lead to various kinds of mass customization that eventually make concepts like money as a simple numerical quantity quite obsolete.

Future schemes for such things as auctions and voting may also perhaps be mined from the computational universe, and as a result may be mass customized on demand. And, more speculatively, the same might be true for future corporate or political organizational structures. Or for example for mechanisms for social and other human networks.

In addition to using NKS in “technology mode” as a way to create things, one can also use NKS in “science mode” as a way to model and understand things. And typically the goal is to find in the computational universe some simple program whose behavior captures the essence of whatever system or phenomenon one is trying to analyze. This was an important focus of the NKS book, and has been a major theme in the past decade of NKS research. In general in science it has been difficult to come up with new models for things. But the computational universe is an unprecedentedly rich source—and I would expect that before long the rate of new models derived from it will come to far exceed all those from traditional mathematical and other sources.

An important trend in today’s world is the availability of more and more data, often collected with automated sensors, or in some otherwise automated way. Often—as we see in many areas of Wolfram|Alpha or in experiments on personal analytics—there are tantalizing regularities in the data. But the challenge that now exists is to find good models for the data. Sometimes these models are important for basic science; more often they are important for practical purposes of prediction, anomaly detection, pattern matching and so on.

In the past, one might find a model from data by using statistics or machine learning in effect to fit parameters of some formula or algorithm. But NKS suggests that instead one should try to find in the computational universe some set of underlying rules that can be run to simulate the essence of whatever generates the data. At present, the methods we have for finding models in the computational universe are still fairly ad hoc. But in time it will no doubt be possible to streamline this process, and to develop some kind of highly systematic methodology—a rough analog of the historical progression from calculus to statistics.
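
In the very simplest cases, such a model search can be written down directly. Here is a minimal Mathematica sketch, in which the “data” is a bit sequence (manufactured here from rule 30, standing in for a real measurement), and the search finds the elementary cellular automata whose center columns reproduce it exactly:

    data = Flatten[CellularAutomaton[30, {{1}, 0}, {40, {{0}}}]];  (* stand-in for measured bits *)
    models = Select[Range[0, 255],
      Flatten[CellularAutomaton[#, {{1}, 0}, {40, {{0}}}]] === data &]

Real searches of course need looser notions of matching (capturing the essence of the data rather than every bit), which is exactly where the ad hoc methods mentioned above come in.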

There are many areas where it is clear that NKS models will be important—perhaps because the phenomenon being modeled is too complex for traditional approaches, or perhaps because, as is becoming so common in practice, the underlying system has elements that are specifically set up to be computational.

One area where NKS models seem likely to be particularly important is medicine. In the past, most disorders that medicine successfully addressed were fundamentally either structural or chemical. But today’s most important challenge areas—like aging, cancer, immune response and brain functioning—all seem to be associated more with large-scale systems containing many interacting parts. And it certainly seems plausible that the best models for these systems will be based on simple programs that exist in the computational universe.

In recent times, medicine has slowly been becoming more quantitative. But somehow it is still always based on small collections of numbers, that lead to a small set of possible diagnoses. But between the coming wave of automated data acquisition, and the use of underlying NKS models, I suspect that the future of medicine will be more about dynamic computation than about specific discrete diagnoses. But even given a good predictive model of what is going on in a particular medical situation, it will still often be a challenge to figure out just what intervention to make—though the character of this problem will no doubt change when algorithmic drugs and computational materials exist.

What would be the most spectacular success for NKS models? Perhaps models that lead to an understanding of aging, or cancer. Perhaps more accurate models for social or economic processes. Or perhaps a final fundamental theory of physics.

In the NKS book, I started looking at what might be involved in finding the underlying rules for our physical universe out in the computational universe. I developed some network-based models that operate in a sense below space and time, and from which I was already able to derive some surprisingly interesting features of physics as we know it. Of course, we have no guarantee that our physical universe has rules that are simple enough to be found, say, by an explicit search in the computational universe. But over the past decade I have slowly been building up the rather large software and analysis capabilities necessary to mount a serious search. And if successful, this will certainly be an important piece of validation for the NKS approach—as well as being an important moment for science in general.

Beyond science and technology, another important consequence of a new worldview like NKS is the effect that it can have on everyday thinking. And certainly the mathematical approach to science has had a profound effect on how we think about all kinds of issues and processes. For today, whether we’re talking about business or psychology or journalism, we end up using words and ideas—like “momentum” and “exponential”—that come directly from this approach. Already there are analogs from NKS that are increasingly used—like “computationally irreducible” and “intrinsically random”. And as such concepts become more widespread they will inform thinking about more and more things—whether it’s describing the operation of an organization, or working out what could conceivably be predictable for purposes of liability.

Beyond everyday thinking, the ideas and results of NKS will also no doubt have increasing influence on many areas of philosophical thinking. In the past, most of the understanding for what science could contribute to philosophy came from the mathematical approach to science. But now the new concepts and results in NKS in a sense provide a large number of new “raw facts” from which philosophy can operate.

The principles of NKS are important not only at an intellectual level, but also at a practical level. For they give us ideas about what might be possible, and what might not. For example, the Principle of Computational Equivalence in effect implies that there can be nothing general and abstract that is special about intelligence, and that in effect all its features must just be reflections of computation. And it is this that made me realize soon after the NKS book appeared that my long-term goal of making knowledge broadly computable might be achievable “just with computation”—which is what led me to embark on the Wolfram|Alpha project.

I have talked elsewhere about some of the consequences of the principles of NKS for the long-range future of the human condition. But suffice it to say here that we can expect an increasing delegation of human intellectual activities to computational systems—but with ultimate purposes still of necessity defined by humans and the history of human culture and civilization. And perhaps the place where NKS principles will enter most explicitly is in making future legal and other distinctions about what really constitutes responsibility, or a mind, or a memory as opposed to a computation.

As we look at the future of history, there are some inexorable trends, and then there are some wild cards. If we find the fundamental theory of physics, will we be able to hack it to achieve something like instantaneous travel? Will we find some key principle that lets us reverse aging? Will we be able to map memories directly from one brain to another, without the intermediate step of language? Will we find extraterrestrial intelligence? About all these questions, NKS has much to say.

If we look back at the mathematical approach to science, one of its societal consequences has been the injection of mathematics into education. To some extent, a knowledge of mathematical principles is necessary to interact with the world as it exists today. It is also an important foundation for understanding fields that have made serious use of the mathematical approach to science. And certainly learning mathematics to at least some level is a convenient way to teach precise structured thinking in general.

But I believe NKS also has much to contribute to education. At an elementary level, it can be viewed as a kind of “pre-computer science”, introducing fundamental notions of computation in a direct and often visual way. At a more sophisticated level, NKS provides a conceptual framework for understanding the foundations of many computational fields. And even from what I have seen over the past decade, education about NKS—a little like physics before it—seems to provide a powerful springboard for people entering all sorts of modern areas.

What about NKS research? There is much to be done in the many applications of NKS. But there is also much to be done in pure NKS—studying the basic science of the computational universe. The NKS book—and the decade of research that has followed it—has only just begun to scratch the surface in exploring and investigating the vast range of possible simple programs. The situation is in some ways a little like in chemistry—where there is an infinite variety of possible chemical compounds, each with its own features, that can be studied either for its own sake, or for the purpose of inferring general principles, or for diverse potential applications. And where even after a century or more, only a small part of what is possible has been done.

In the computational universe it is quite remarkable how much can be said about almost any simple program with nontrivial behavior. And the more one knows about a given program, the more potential there is to find interesting applications of it, whether for modeling, technology, art or whatever. Sometimes there are features of programs that can be almost arbitrarily difficult to determine. But sometimes they can be important. And so, for example, it will be important to get more evidence for (or against) the Principle of Computational Equivalence by trying to establish computation universality for a variety of simple programs (rule 30 would be a particularly important achievement).
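
For anyone who wants to experiment, exploring a simple program like rule 30 takes just one line in Mathematica, using only the standard built-in CellularAutomaton function:

(* evolve rule 30 for 100 steps from a single black cell, and plot the result *)
ArrayPlot[CellularAutomaton[30, {{1}, 0}, 100]]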

As more is done in pure NKS, so its methodologies will become more streamlined. And for example there will be ever clearer principles and conventions for what constitutes a good computer experiment, and how the results of investigations on simple programs should be communicated. There are fields other than NKS—notably mathematics—where computer experiments also make sense. But my guess is that the kind of exploratory computer experimentation that is a hallmark of pure NKS will always end up largely classified as pure NKS, even if its subject matter is quite mathematical.

If one looks at the future of NKS research, an important issue is how it is structured in the world. Some part of it—like for mathematics—may be driven by education. Some part may be driven by applications, and their commercial success. But in the long term just how the pure basic science of NKS should be conducted is not yet clear. Should there be prizes? Institutions? Socially oriented value systems? As a young field NKS has the potential to take some novel approaches.

For an intellectual framework of the magnitude of NKS, a decade is a very short time. And as I write this post, I realize anew just how great the potential of NKS is. I am proud of the part I played in launching NKS, and I look forward to watching and participating in its progress for many years to come.

Announcing Wolfram SystemModeler

Today I’m excited to be able to announce that our company is moving into yet another new area: large-scale system modeling. Last year, I wrote about our plans to initiate a new generation of large-scale system modeling. Now we are taking a major step in that direction with the release of Wolfram SystemModeler.

SystemModeler is a very general environment that handles modeling of systems with mechanical, electrical, thermal, chemical, biological, and other components, as well as combinations of different types of components. It’s based—like Mathematica—on the very general idea of representing everything in symbolic form.

In SystemModeler, a system is built from a hierarchy of connected components—often assembled interactively using SystemModeler’s drag-and-drop interface. Internally, what SystemModeler does is to derive from its symbolic system description a large collection of differential-algebraic and other equations and event specifications—which it then solves using powerful built-in hybrid symbolic-numeric methods. The result of this is a fully computable representation of the system—that mirrors what an actual physical version of the system would do, but allows instant visualization, simulation, analysis, or whatever.
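
To get a feeling for what's underneath, here's a tiny sketch in Mathematica itself (not SystemModeler): a toy differential-algebraic system, one differential equation plus one algebraic constraint, handed to NDSolve, which likewise combines symbolic preprocessing with numerical solving:

(* a toy DAE: x[t] evolves differentially, while y[t] is pinned by an algebraic constraint *)
sol = NDSolve[{x'[t] == y[t], x[t]^2 + y[t]^2 == 1, x[0] == 0, y[0] == 1}, {x, y}, {t, 0, 1.5}];

(* the exact solution is x[t] = Sin[t], y[t] = Cos[t] *)
Plot[Evaluate[{x[t], y[t]} /. First[sol]], {t, 0, 1.5}]

A SystemModeler model is the same idea, but with thousands of coupled equations instead of two.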

Here’s an example of SystemModeler in action—with a 2,685-equation dynamic model of an airplane being used to analyze the control loop for continuous descent landings:

Continuous descent landings for an aircraft shown in Wolfram SystemModeler

There’s a long and tangled history of products that do various kinds of system modeling. The exciting thing about SystemModeler is that from its very foundations, it takes a new approach that dramatically unifies and generalizes what’s possible. In the past, products tended either to be specific to a particular application domain (like electric circuits or hydraulics), or were based on rigid low-level component models such as procedural blocks.

What SystemModeler does is to use a fully symbolic representation of everything, which immediately allows both arbitrary domains to be covered, and much more flexible models for components to be used. In the past, little could have been done with such a general representation. But the major breakthrough is that by using a new generation of hybrid symbolic-numeric methods, SystemModeler is capable of successfully solving for the behavior of even very large-scale systems of this kind.

When one starts SystemModeler, there’s immediately a library of thousands of standard components—sensors, actuators, gears, resistors, joints, heaters, and so on. And one of the key features of SystemModeler is that it uses the new standard Modelica language for system specifications—so one can immediately make use of model libraries from component manufacturers and others.

Libraries

SystemModeler is set up to automate many kinds of system modeling work. Once one’s got a system specified, SystemModeler can simulate any aspect of the behavior of the system, producing visualizations and 3D animations. It can also synthesize a report in the form of an interactive website—or generate a computable model of the system as a standalone executable.

These capabilities alone would make SystemModeler an extremely useful and important new product, for a whole range of industries from aerospace to automotive, marine, consumer, manufacturing, and beyond.

But there’s more. Remember that we have Mathematica too. And SystemModeler integrates directly with Mathematica—bringing in our whole 25-year Mathematica technology stack.

This makes possible many spectacular things. Just like Mathematica can operate on data or images or programs, so now it can also operate on computable models from SystemModeler. This means that it takes just a line or two of Mathematica code to do a parameter sweep, or a sensitivity analysis, or a sophisticated optimization on a model from SystemModeler.
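
As a rough sketch of what that can look like (the model name, the "damping" parameter, and the exact signatures of the link functions here are illustrative assumptions, not copied from the product's documentation):

(* simulate a hypothetical model once for each of several damping values *)
sims = Table[WSMSimulate["MyCompany.DrivenPendulum", {0, 10}, WSMParameterValues -> {"damping" -> d}], {d, {0.1, 0.2, 0.5, 1.0}}];

(* overlay one variable from all of the runs *)
WSMPlot[sims, {"pendulum.angle"}]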

And one gets all of the interface features of Mathematica—being able to do visualizations, instantly introduce interactive controls, or produce computable CDF documents as reports.

But even more than this, one gets to use all of the algorithms and analysis capabilities of Mathematica. So it becomes straightforward to take a model, and do statistical analysis on it, build a control system for it, or export results in any of the formats Mathematica supports.

When one builds models, it’s often important to bring in real-world data, say material properties or real-time weather or cost information. And through its direct link to Wolfram|Alpha—as well as its custom data import capabilities—Mathematica can supply these to SystemModeler.

To me, it’s very satisfying seeing all these parts of our technology portfolio working together. And this is just the beginning. As I discussed in my post last year, it’s going to be possible to integrate system modeling not only with Mathematica, but also at a deep level with Wolfram|Alpha and such things as our mobile apps.

But today, it’s exciting to me to launch Wolfram SystemModeler as a major new direction for our company. Mathematica allows us to represent a vast range of formal and algorithmic systems; SystemModeler extends our reach to large-scale practical engineering and other systems. We already know some of the important things that this will make possible. But I’m sure there will be many wonderful surprises to come in the years ahead, as we gradually realize just what the power of symbolic systems modeling really is.

Wolfram SystemModeler examples

(See the Wolfram SystemModeler website for more information—or check out our new courses about system modeling.)

Happy 100th Birthday, Alan Turing

(This is an updated version of a post I wrote for Alan Turing’s 98th birthday.)

Today (June 23, 2012) would have been Alan Turing’s 100th birthday—if he had not died in 1954, at the age of 41.

I never met Alan Turing; he died five years before I was born. But somehow I feel I know him well—not least because many of my own intellectual interests have had an almost eerie parallel with his.

And by a strange coincidence, Mathematica’s “birthday” (June 23, 1988) is aligned with Turing’s—so that today is also the celebration of Mathematica’s 24th birthday.

I think I first heard about Alan Turing when I was about eleven years old, right around the time I saw my first computer. Through a friend of my parents, I had gotten to know a rather eccentric old classics professor, who, knowing my interest in science, mentioned to me this “bright young chap named Turing” whom he had known during the Second World War.

One of the classics professor’s eccentricities was that whenever the word “ultra” came up in a Latin text, he would repeat it over and over again, and make comments about remembering it. At the time, I didn’t think much of it—though I did remember it. Only years later did I realize that “Ultra” was the codename for the British cryptanalysis effort at Bletchley Park during the war. In a very British way, the classics professor wanted to tell me something about it, without breaking any secrets. And presumably it was at Bletchley Park that he had met Alan Turing.

A few years later, I heard scattered mentions of Alan Turing in various British academic circles. I heard that he had done mysterious but important work in breaking German codes during the war. And I heard it claimed that after the war, he had been killed by British Intelligence. At the time, at least some of the British wartime cryptography effort was still secret, including Turing’s role in it. I wondered why. So I asked around, and started hearing that perhaps Turing had invented codes that were still being used. (In reality, the continued secrecy seems to have been intended to prevent it being known that certain codes had been broken—so other countries would continue to use them.)

I’m not sure where I next encountered Alan Turing. Probably it was when I decided to learn all I could about computer science—and saw all sorts of mentions of “Turing machines”. But I have a distinct memory from around 1979 of going to the library, and finding a little book about Alan Turing written by his mother, Sara Turing.

And gradually I built up quite a picture of Alan Turing and his work. And over the 30+ years that have followed, I have kept on running into Alan Turing, often in unexpected places.

In the early 1980s, for example, I had become very interested in theories of biological growth—only to find (from Sara Turing’s book) that Alan Turing had done all sorts of largely unpublished work on that.

And for example in 1989, when we were promoting an early version of Mathematica, I decided to make a poster of the Riemann zeta function—only to discover that Alan Turing had at one time held the record for computing zeros of the zeta function. (Earlier he had also designed a gear-based machine for doing this.)

Recently I even found out that Turing had written about the “reform of mathematical notation and phraseology”—a topic of great interest to me in connection with both Mathematica and Wolfram|Alpha.

And at some point I learned that a high school math teacher of mine (Norman Routledge) had been a friend of Turing’s late in his life. But even though my teacher knew my interest in computers, he never mentioned Turing or his work to me. And indeed, 35 years ago, Alan Turing and his work were little known, and it is only fairly recently that Turing has become as famous as he is today.

Turing’s greatest achievement was undoubtedly his construction in 1936 of a universal Turing machine—a theoretical device intended to represent the mechanization of mathematical processes. And in some sense, Mathematica is precisely a concrete embodiment of the kind of mechanization that Turing was trying to represent.

In 1936, however, Turing’s immediate purpose was purely theoretical. And indeed it was to show not what could be mechanized in mathematics, but what could not. In 1931, Gödel’s theorem had shown that there were limits to what could be proved in mathematics, and Turing wanted to understand the boundaries of what could ever be done by any systematic procedure in mathematics.

Turing was a young mathematician in Cambridge, England, and his work was couched in terms of mathematical problems of his time. But one of his steps was the theoretical construction of a universal Turing machine capable of being “programmed” to emulate any other Turing machine. In effect, Turing had invented the idea of universal computation—which was later to become the foundation on which all of modern computer technology is built.

At the time, though, Turing’s work did not make much of a splash, probably largely because the emphasis of Cambridge mathematics was elsewhere. Just before Turing published his paper, he learned about a similar result by Alonzo Church from Princeton, formulated not in terms of theoretical machines, but in terms of the mathematics-like lambda calculus. And as a result Turing went to Princeton for a year to study with Church—and while he was there, wrote the most abstruse paper of his life.

The next few years for Turing were dominated by his wartime cryptanalysis work. I learned a few years ago that during the war Turing visited Claude Shannon at Bell Labs in connection with speech encipherment. Turing had been working on a kind of statistical approach to cryptanalysis—and I am extremely curious to know whether Turing told Shannon about this, and potentially launched the idea of information theory, which itself was first formulated for secret cryptanalysis purposes.

After the war, Turing got involved with the construction of the first actual computers in England. To a large extent, these computers had emerged from engineering, not from a fundamental understanding of Turing’s work on universal computation.

There was, however, a definite, if circuitous, connection. In 1943 Warren McCulloch and Walter Pitts in Chicago wrote a theoretical paper about neural networks that used the idea of universal Turing machines to discuss general computation in the brain. John von Neumann read this paper, and used it in his recommendations about how practical computers should be built and programmed. (John von Neumann had known about Turing’s paper in 1936, but at the time did not recognize its significance, instead describing Turing in a recommendation letter as having done interesting work on the Central Limit Theorem.)

It is remarkable that in just over a decade Alan Turing was transported from writing theoretically about universal computation, to being able to write programs for an actual computer. I have to say, though, that from today’s vantage point, his programs look incredibly “hacky”—with lots of special features packed in, and encoded as strange strings of letters. But perhaps to reach the edge of a new technology it’s inevitable that there has to be hackiness.

And perhaps too it required a certain hackiness to construct the very first universal Turing machine. The concept was correct, but Turing quickly published an erratum to fix some bugs, and in later years, it’s become clear that there were more bugs. But at the time Turing had no intuition about how easily bugs can occur.

Turing also did not know just how general or not his results about universal computation might be. Perhaps the Turing machine was just one model of a computational process, and other models—or brains—might have quite different capabilities. But gradually over the course of several decades, it became clear that a wide range of possible models were actually exactly equivalent to the machines Turing had invented.

It’s strange to realize that Alan Turing never appears to have actually simulated a Turing machine on a computer. He viewed Turing machines as theoretical devices relevant for proving general principles. But he does not appear to have thought about them as concrete objects to be explicitly studied.

And indeed, when Turing came to make models of biological growth processes, he immediately started using differential equations—and appears never to have considered the possibility that something like a Turing machine might be relevant to natural processes.

When I became interested in simple computational processes around 1980, I also didn’t consider Turing machines—and instead started off studying what I later learned were called cellular automata. And what I discovered was that even cellular automata with incredibly simple rules could produce incredibly complex behavior—which I soon realized could be considered as corresponding to a complex computation.

I probably simulated my first explicit Turing machine only in 1991. To me, Turing machines were built a little bit too much like engineering systems—and not like something that would likely correspond to a system in nature. But I soon found that even simple Turing machines, just like simple cellular automata, could produce immensely complex behavior.

In a sense, Alan Turing could easily have discovered this. But his intuition—like my original intuition—would have told him that no such phenomenon was possible. So it would likely only have been luck—and access to easy computation—that would have led him to find the phenomenon.

Had he done so, I am quite sure he would have become curious about just what the threshold for his concept of universality would be, and just how simple a Turing machine would suffice. In the mid-1990s, I searched the space of simple Turing machines, and found the smallest possible candidate. And after I put up a $25,000 prize, in 2007 Alex Smith showed that indeed this Turing machine is universal.
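
That machine is now built into Mathematica's TuringMachine function. Here is a minimal sketch of running it (596440 is the machine's number in Wolfram's enumeration of 2-state, 3-color machines; the plotting idiom is just one convenient choice):

(* run the 2,3 Turing machine for 200 steps from a blank tape, and plot the successive tapes *)
ArrayPlot[Last /@ TuringMachine[{596440, 2, 3}, {1, {{}, 0}}, 200]]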

No doubt Alan Turing would quite quickly have grasped the significance of such results for thinking about both natural processes and mathematics. But without the empirical discoveries, his thinking did not progress in this direction.

Instead, he began to consider from a more engineering point of view to what extent computers should be able to emulate brains, and he invented ideas like the Turing Test. Reading through his writings today, it is remarkable how many of his conceptual arguments about artificial intelligence still need to be made—though some, like his discussion of extrasensory perception, have become quaintly dated.

And looking at his famous 1950 article on “Computing Machinery and Intelligence” one sees a discussion of programming into a machine the contents of Encyclopaedia Britannica—which he estimates should take 60 workers 50 years. I wonder what Alan Turing would think of Wolfram|Alpha, which, thanks to progress over the past 60 years, and perhaps some cleverness, has so far taken at least slightly less human effort.

In addition to his intellectual work, Turing has in recent times become something of a folk hero, most notably through the story of his death. Almost certainly it will never be known for sure whether his death was in fact intentional. But from what I know and have heard I must say that I rather doubt that it was.

When one first hears that Alan Turing died by eating an apple impregnated with cyanide one assumes it must have been intentional suicide. But when one later discovers that he was quite a tinkerer, had recently made cyanide for the purpose of electroplating spoons, kept chemicals alongside his food, and was rather a messy individual, the picture becomes a lot less clear.

I often wonder what Alan Turing would have been like to meet. I do not know of any recording of his voice (though he did once do a BBC radio broadcast). But I gather that even near the end of his life he giggled a lot, and talked with a kind of stutter that seemed to come from thinking faster than he was talking. He seemed to have found it easiest to talk to mathematicians. He thought a little about physics, though doesn’t seem to have ever gotten deeply into it. And he seemed to have maintained a child-like enthusiasm and wonder for many intellectual questions throughout his life.

He was something of a loner, working successively on his various projects, largely on his own. He was gay, and lived alone. He was no organizational politician, and towards the end of his life seems to have found himself largely ignored both by people working on computers and by people working on his new interest of biological growth and morphogenesis.

He was in some respects a quintessential British amateur, dipping his intellect into different areas. He achieved a high level of competence in pure mathematics, and used that as his professional base. His contributions in traditional mathematics were certainly perfectly respectable, though not spectacular. But in every area he touched, there was a certain crispness to the ideas he developed—even if their technical implementation was sometimes shrouded in arcane notation and masses of detail.

In some ways he was fortunate to live when he did. For he was at the right time to be able to take the formalism of mathematics as it had been developed, and to combine it with the emerging engineering of his day, to see for the first time the general concept of computation.

It is perhaps a shame that he died 25 years before computer experiments became widely feasible. I certainly wonder what he would have discovered tinkering with Mathematica. I don’t doubt that he would have pushed it to its limits, writing code that would horrify me. But I fully expect that long before I did, he would have discovered the main elements of NKS, and begun to understand their significance.

He would probably be disappointed that 60 years after he invented the Turing test, there is still no full human-like artificial intelligence. And perhaps long ago he would have begun to campaign for the creation of something like Wolfram|Alpha, to turn human knowledge into something computers can handle.

If he had lived a few decades longer, he would no doubt have applied himself to a half dozen more areas. But there is still much to be grateful for in what Alan Turing did achieve in his 41 years, and his modern reputation as the founding father of the concept of computation—and the conceptual basis for much of what I, for example, have done—is well deserved.

Happy posthumous 100th birthday, Alan Turing.

A few additional pointers:

Turing machine history in A New Kind of Science »
TuringMachine function in Mathematica »
Turing machines in the Wolfram Demonstrations Project »
Turing machines in Wolfram|Alpha »
The Alan Turing Year »


A Moment for Particle Physics: The End of a 40-Year Story?

The announcement early yesterday morning of experimental evidence for what’s presumably the Higgs particle brings a certain closure to a story I’ve watched (and sometimes been a part of) for nearly 40 years. In some ways I felt like a teenager again. Hearing about a new particle being discovered. And asking the same questions I would have asked at age 15. “What’s its mass?” “What decay channel?” “What total width?” “How many sigma?” “How many events?”

When I was a teenager in the 1970s, particle physics was my great interest. It felt like I had a personal connection to all those kinds of particles that were listed in the little book of particle properties I used to carry around with me. The pions and kaons and lambda particles and f mesons and so on. At some level, though, the whole picture was a mess. A hundred kinds of particles, with all sorts of detailed properties and relations. But there were theories. The quark model. Regge theory. Gauge theories. S-matrix theory. It wasn’t clear what theory was correct. Some theories seemed shallow and utilitarian; others seemed deep and philosophical. Some were clean but boring. Some seemed contrived. Some were mathematically sophisticated and elegant; others were not.

By the mid-1970s, though, those in the know had pretty much settled on what became the Standard Model. In a sense it was the most vanilla of the choices. It seemed a little contrived, but not very. It involved some somewhat sophisticated mathematics, but not the most elegant or deep mathematics. But it did have at least one notable feature: of all the candidate theories, it was the one that most extensively allowed explicit calculations to be made. They weren't easy calculations—and in fact it was doing those calculations that got me started using computers to do calculations, and set me on the path that eventually led to Mathematica. But at the time I think the very difficulty of the calculations seemed to me and everyone else to make the theory more satisfying to work with, and more likely to be meaningful.

At least in the early years there were still surprises, though. In November 1974 there was the announcement of the J/psi particle. And one asked the same questions as today, starting with “What’s the mass?” (That particle’s was 3.1 GeV; today’s is 126 GeV.) But unlike with the Higgs particle, to almost everyone the J/psi was completely unexpected. At first it wasn’t at all clear what it could be. Was it evidence of something truly fundamental and exciting? Or was it in a sense just a repeat of things that had been seen before?

My own very first published paper (feverishly worked on over Christmas 1974 soon after I turned 15) speculated that it and some related phenomena might be something exciting: a sign of substructure in the electron. But however nice and interesting a theory may be, nature doesn’t have to follow it. And in this case it didn’t. And instead the phenomena that had been seen turned out to have a more mundane explanation: they were signs of an additional (4th) kind of quark (the c or charm quark).

In the next few years, more surprises followed. Mounting evidence showed that there was a heavier analog of the electron and muon—the tau lepton. Then in July 1977 there was another “sudden discovery”, made at Fermilab: this time of a particle based on the b quark. I happened to be spending the summer of 1977 doing particle physics at Argonne National Lab, not far away from Fermilab. And it was funny: I remember there was a kind of blasé attitude toward the discovery. Like “another unexpected particle physics discovery; there’ll be lots more”.

But as it turned out that’s not what happened. It’s been 35 years, and when it comes to new particles and the like, there really hasn’t been a single surprise. (The discovery of neutrino masses is a partial counterexample, as are various discoveries in cosmology.) Experiments have certainly discovered things—the W and Z bosons, the validity of QCD, the top quark. But all of them were as expected from the Standard Model; there were no surprises.

Needless to say, verifying the predictions of the Standard Model hasn’t always been easy. A few times I happened to be at the front lines. In 1977, for example, I computed what the Standard Model predicted for the rate of producing charm particles in proton-proton collisions. But the key experiment at the time said the actual rate was much lower. I spent ages trying to figure out what might be wrong—either with my calculations or the underlying theory. But in the end—in a rather formative moment for my understanding of applying the scientific method—it turned out that what was wrong was actually the experiment, not the theory.

In 1979—when I was at the front lines of the “discovery of the gluon”—almost the opposite thing happened. The conviction in the Standard Model was by then so great that the experiments agreed too early, even before the calculations were correctly finished. Though once again, in the end all was well, and the method I invented for doing analysis of the experiments is in fact still routinely used today.

By 1981 I myself was beginning to drift away from particle physics, not least because I’d started to work on things that I thought were somehow more fundamental. But I still used to follow what was happening in particle physics. And every so often I’d get excited when I heard about some discovery rumored or announced that seemed somehow unexpected or inexplicable from the Standard Model. But in the end it was all rather disappointing. There’d be questions about each discovery—and in later years there’d often be suspicious correlations with deadlines for funding decisions. And every time, after a while, the discovery would melt away. Leaving only the plain Standard Model, with no surprises.

Through all of this, though, there was always one loose end dangling: the Higgs particle. It wasn’t clear just what it would take to see it, but if the Standard Model was correct, it had to exist.

To me, the Higgs particle and the associated Higgs mechanism had always seemed like an unfortunate hack. In setting up the Standard Model, one begins with a mathematically quite pristine theory in which every particle is perfectly massless. But in reality almost all particles (apart from the photon) have nonzero masses. And the point of the Higgs mechanism is to explain this—without destroying desirable features of the original mathematical theory.

Here’s how it basically works. Every type of particle in the Standard Model is associated with waves propagating in a field—just as photons are associated with waves propagating in the electromagnetic field. But for almost all types of particles, the average amplitude value of the underlying field is zero. But for the Higgs field, one imagines something different. One imagines instead that there’s a nonlinear instability that’s built into the mathematical equations that govern it, that leads to a nonzero average value for the field throughout the universe.

And it’s then assumed that all types of particles continually interact with this background field—in such a way as to act so that they have a mass. But what mass? Well, that’s determined by how strongly a particle interacts with the background field. And that in turn is determined by a parameter that one inserts into the model. So to get the observed masses of the particles, one’s just inserting one parameter for each particle, and then arranging it to give the mass of the particle.

That might seem contrived. But at some level it’s OK. It would have been nice if the theory had predicted the masses of the particles. But given that it does not, inserting their values as interaction strengths seems as reasonable as anything.

Still, there’s another problem. To get the observed particle masses, the background Higgs field that exists throughout the universe has to have an incredibly high density of energy and mass. Which one might expect would have a huge gravitational effect—in fact, enough of an effect to cause the universe to roll up into a tiny ball. Well, to avoid this, one has to assume that there’s a parameter (a “cosmological constant”) built right into the fundamental equations of gravity that cancels to incredibly high precision the effects of the energy and mass density associated with the background Higgs field.

And if this doesn’t seem implausible enough, back around 1980 I was involved in noticing something else: this delicate cancellation can’t survive at the high temperatures of the very early Big Bang universe. And the result is that there has to be a glitch in the expansion of the universe. My calculations said this glitch would not be terribly big—but stretching the theory somewhat led to the possibility of a huge glitch, and in fact an early version of the whole inflationary universe scenario.

Back around 1980, it seemed as if unless there was something wrong with the Standard Model it wouldn’t be long before the Higgs particle would show up. The guess was that its mass might be perhaps 10 GeV (about 10 proton masses)—which would allow it to be detected in the current or next generation of particle accelerators. But it didn’t show up. And every time a new particle accelerator was built, there’d be talk about how it would finally find the Higgs. But it never did.

Back in 1979 I’d actually worked on questions about what possible masses particles could have in the Standard Model. The instability in the Higgs field used to generate mass ran the risk of making the whole universe unstable. And I found that this would happen if there were quarks with masses above about 300 GeV. This made me really curious about the top quark—which pretty much had to exist, but kept on not being discovered. Until finally in 1995 it showed up—with a mass of 173 GeV, leaving to my mind a surprisingly thin margin away from total instability of the universe.

There were a few bounds on the mass of the Higgs particle too. At first they were very loose (“below 1000 GeV” etc.). But gradually they became tighter and tighter. And after huge amounts of experimental and theoretical work, by last year they pretty much said the mass had to be between 110 and 130 GeV. So in a sense one can’t be too surprised about yesterday’s announcement of evidence for a Higgs particle with a mass of 126 GeV. But explicitly seeing what appears to be the Higgs particle is an important moment. Which finally seems to tie up a 40-year loose end.

At some level I’m actually a little disappointed. I’ve made no secret—even to Peter Higgs—that I’ve never especially liked the Higgs mechanism. It’s always seemed like a hack. And I’ve always hoped that in the end there’d be something more elegant and deep responsible for something as fundamental as the masses of particles. But it appears that nature is just picking what seems like a pedestrian solution to the problem: the Higgs mechanism in the Standard Model.

Was it worth spending more than $10 billion to find this out? I definitely think so. Now, what’s actually come out is perhaps not the most exciting thing that could have come out. But there’s absolutely no way one could have been sure of this outcome in advance.

Perhaps I’m too used to the modern technology industry where billions of dollars get spent on corporate activities and transactions all the time. But to me spending only $10 billion to get this far in investigating the basic theory of physics seems like quite a bargain.

I think it could be justified almost just for the self-esteem of our species: that despite all our specific issues, we’re continuing a path we’ve been on for hundreds of years, systematically making progress in understanding how our universe works. And somehow there’s something ennobling about seeing what’s effectively a worldwide collaboration of people working together in this direction.

Indeed, staying up late to watch the announcement early yesterday morning reminded me more than a bit of being a kid in England nearly 43 years ago and staying up late to watch the Apollo 11 landing and moonwalk (which was timed to be at prime time in the US but not Europe). But I have to say that for a world achievement yesterday’s “it’s a 5 sigma effect” was distinctly less dramatic than “the Eagle has landed”. To be fair, a particle physics experiment has a rather different rhythm than a space mission. But I couldn’t help feeling a certain sadness for the lack of pizazz in yesterday’s announcement.

Of course, it’s been a long hard road for particle physics these past 30 or so years. Back in the 1950s when particle physics was launched in earnest, there was a certain sense of follow-on and “thank you” for the Manhattan project. And in the 1960s and 1970s the pace of discoveries kept the best and the brightest coming into particle physics. But by the 1980s as particle physics settled into its role as an established academic discipline, there began to be an ever stronger “brain drain”. And by the time the Superconducting Super Collider project was canceled in 1993, it was clear that particle physics had lost its special place in the world of basic research.

Personally, I found it sad to watch. Visiting particle physics labs after absences of 20 years, and seeing crumbling infrastructure in what I had remembered as such vibrant places. In a sense it is remarkable and admirable that through all this thousands of particle physicists persisted, and have now brought us (presumably) the Higgs particle. But watching yesterday’s announcement, I couldn’t help feeling that there was a certain sense of resigned exhaustion.

I suppose I had hoped for something qualitatively different from those particle physics talks I used to hear 40 years ago. Yes, the particle energies were larger, the detector was bigger, and the data rates were faster. But otherwise it seemed like nothing had changed (well, there also seemed to be a new predilection for statistical ideas like p values). There wasn’t even striking and memorable dynamic imagery of prized particle events, making use of all those modern visualization techniques that people like me have worked so hard to develop.

If the Standard Model is correct, yesterday’s announcement is likely to be the last major discovery that could be made in a particle accelerator in our generation. Now, of course, there could be surprises, but it’s not clear how much one should bet on them.

So is it still worth building particle accelerators? Whatever happens, there is clearly great value in maintaining the thread of knowledge that exists today about how to do it. But reaching particle energies where without surprises one can reasonably expect to see new phenomena will be immensely challenging. I have thought for years that investing in radically new ideas for particle acceleration (e.g. higher energies for fewer particles) might be the best bet—though it clearly carries risk.

Could future discoveries in particle physics immediately give us new inventions or technology? Years ago things like “quark bombs” seemed conceivable. But probably no more. Yes, one can use particle beams for their radiation effects. But I certainly wouldn’t expect to see anything like muonic computers, antiproton engines or neutrino tomography systems anytime soon. Of course, all that may change if somehow it’s figured out (and it doesn’t seem obviously impossible) how to miniaturize a particle accelerator.

Over sufficiently long times, basic research has historically tended to be the very best investment one can make. And quite possibly particle physics will be no exception. But I rather expect that the great technological consequences of particle physics will rely more on the development of theory than on more results from experiment. If one figures out how to create energy from the vacuum or transmit information faster than light, it’ll surely be done by applying the theory in new and unexpected ways, rather than by using specific experimental results.

The Standard Model is certainly not the end of physics. There are clearly gaps. We don’t know why parameters like particle masses are the way they are. We don’t know how gravity fits in. And we don’t know about all sorts of things seen in cosmology.

But let’s say we can resolve all this. What then? Maybe then there’ll be another set of gaps and problems. And maybe in a sense there’ll always be a new layer of physics to discover.

I certainly used to assume that. But from my work on A New Kind of Science I developed a different intuition. That in fact there’s no reason all the richness we see in our universe couldn’t arise from some underlying rule—some underlying theory—that’s even quite simple.

There are all sorts of things to say about what that rule might be like, and how one might find it. But what’s important here is that if the rule is indeed simple, then on fundamental grounds one shouldn’t in principle need to know too much information to nail down what it is.

I’m pleased that in some particular types of very low-level models I’ve studied, I’ve already been able to derive Special and General Relativity, and get some hints of quantum mechanics. But there’s plenty more we know in physics that I haven’t yet been able to reproduce.

But what I suspect is that from the experimental results we have, we already know much more than enough to determine what the correct ultimate theory is—assuming that the theory is indeed simple. It won’t be the case that the theory will get the number of dimensions of space and the muon-electron mass ratio right, but will get the Higgs mass or some as-yet-undiscovered detail wrong.

Now of course it could be that something new will be discovered that makes it more obvious what the ultimate theory might look like. But my guess is that we don’t fundamentally need more experimental discoveries; we just need to spend more effort and be better at searching for the ultimate theory based on what we already know. And it’s certainly likely to be true that the human and computer resources necessary to take that search a long way will cost vastly less than actual experiments in particle accelerators.

And indeed, in the end we may find that the data necessary to nail down the ultimate theory already existed 50 years ago. But we won’t know for sure except in hindsight. And once we have a credible candidate for the final theory it may well suggest new particle accelerator experiments to do. And it will be most embarrassing if by then we have no working particle accelerator on which to carry them out.

Particle physics was my first great interest in science. And it is exciting to see now after 40 years a certain degree of closure being reached. And to feel that over the course of that time, at first in particle physics, and later with all the uses of Mathematica, I may have been able to make some small contribution to what has now been achieved.

Wolfram|Alpha Personal Analytics for Facebook

After I wrote about doing personal analytics with data I’ve collected about myself, many people asked how they could do similar things themselves.

Now of course most people haven’t been doing the kind of data collecting that I’ve been doing for the past couple of decades. But these days a lot of people do have a rich source of data about themselves: their Facebook histories.

And today I’m excited to announce that we’ve developed a first round of capabilities in Wolfram|Alpha to let anyone do personal analytics with Facebook data. Wolfram|Alpha knows about all kinds of knowledge domains; now it can know about you, and apply its powers of analysis to give you all sorts of personal analytics. And this is just the beginning; over the months to come, particularly as we see about how people use this, we’ll be adding more and more capabilities.

It’s pretty straightforward to get your personal analytics report: all you have to do is type “facebook report” into the standard Wolfram|Alpha website.

If you’re doing this for the first time, you’ll be prompted to authenticate the Wolfram Connection app in Facebook, and then sign in to Wolfram|Alpha (yes, it’s free). And as soon as you’ve done that, Wolfram|Alpha will immediately get to work generating a personal analytics report from the data it can get about you through Facebook.

Here’s the beginning of the report I get today when I do this:

Facebook report

Yes, it was my birthday yesterday. And yes, as my children are fond of pointing out, I’m getting quite ancient…

I have to admit that I’m not a very diligent user of Facebook (mostly because I have too many other things to do). But I’ve got lots of Facebook friends (most of whom, sadly, I don’t know in real life). And scrolling down in my Wolfram|Alpha personal analytics report, I see this:

Friends' hometowns

Quite a geographic distribution! 85 countries, it says. Nobody from Antarctica, though…

Here’s the age distribution, at least for people who give that data (I do wonder about those 100+ year olds…):

Friends' ages

But these kinds of things are just the beginning. When you type “facebook report”, Wolfram|Alpha generates a pretty seriously long report—almost a small book about you, with more than a dozen major chapters, broken into more than 60 sections, with all sorts of drill-downs, alternate views, etc.

Here’s today’s report for me—which would be a lot longer if I were a more diligent Facebook user:

Full Facebook report

Let’s talk about some of the details. I wish I could do this using my own Facebook data—but I’m just not enough of a Facebook user. So instead I’m going to use data from a few kind souls around our company who’ve agreed to let me share some of their personal analytics.

Close to the top of the report—at least for younger folk—there’s an immediate example of how Wolfram|Alpha’s computational knowledge is used. If it knows from Facebook when and where you were born, it can work out things like what the weather was like (down to the hour in most places—a good memory test for parents!):

More birthdate information

In the standard Wolfram|Alpha Facebook personal analytics report, one of the first major sections is about your general Facebook activity. Here are some results for someone who’s definitely a much more serious Facebook user than me:

Activity

There’s a peak in activity late last fall, when no doubt something was going on in this person’s life. Wolfram|Alpha shows the weekly distribution of all these updates:

Weekly distribution

One can see she does lots of photo posting on Sunday nights, at the end of the weekend. Then there’s a clear gap for sleep, and during standard business hours it’s primarily links and status updates…

What apps does she use? Here’s data on that. Why is there all that Baby Gaga on Tuesdays? I’m guessing that’s something automated.

Weekly app activity

So what’s in someone’s posts? Here’s another part of the standard report, now for a different person:

Post statistics

That’s a nice “most liked post”. Clearly this person (who happens to be the lead developer of the Wolfram|Alpha Facebook personal analytics system that I’m showing here) is pretty upbeat. Look at the word cloud from his posts:

Word cloud

Here’s someone else’s word cloud—notably with her children’s names ordered in size according to age:

Word cloud

You can also look at all sorts of analysis of check-ins, photos, responses to posts, and so on. You can find out which of your posts or photos are most liked, at what times, and by whom:

Most liked photo

A big part of the report Wolfram|Alpha generates is actually not about you, but about your friends.

Like here’s the gender distribution for one person’s friends:

Friends' genders

And here’s their relationship distribution… showing that, yes, as they get older, this person’s friends mostly tend to get progressively more “hitched”:

Friends' relationship statuses

It’s fun to see what names are common among your friends:

Most common friend names

And you can find out things like who you share the most friends with. (For me—with my rather uncurated friend collection—the results were pretty surprising: 2 of the top 5 were people I’d never heard of… though now of course I’m curious about them…)

Your whole network of friends can actually be shown and analyzed as a network. Here’s my network of friends (restricted to female friends, to reduce the number). I’m the big dot at the center. Each other dot represents a friend, arranged based on mutual friendships.

Friend network

It’s clearer if I take myself out of the picture, and just show how my friends are connected to each other:

Friend network

The size of each dot is proportional to the number of friends from my network that that person has. The network is laid out automatically by Wolfram|Alpha, and the colors represent different clusters of friends. It’s interesting to see who my “big connectors” are. If you roll over each dot, you’ll see who it is. The “connector” highlighted here happens to be a long-time former HR director at our company…

Different people seem to end up with very different-looking networks. Here are a few examples:

Multiple friend networks

Sometimes there’s a “biggest connector”—perhaps someone’s spouse. Sometimes there are lots of disjoint clusters (secret lives?). Sometimes—like for my complete friend network, shown in the bottom right—it’s just a big mess, indicating an uncurated collection of friends. And of course you can also use Wolfram|Alpha to do all kinds of fancy graph theory on your friend network—trying to learn for example what “cliques” (in the official graph-theoretic sense) you’re involved with…

OK, so let’s say you’ve found something interesting in your personal analytics report. What can you do with it? We recently introduced a feature called Clip ’n Share. Roll over an image, and a “share” icon comes up. Click it, and you can create a permanent web page that you can link to from Facebook or wherever.

Wolfram|Alpha Clip 'n Share

OK, so that’s some of what you can do with your Wolfram|Alpha personal analytics report. But you can actually get a report not just on yourself, but also (so long as they allow it) on your friends. Just enter “facebook friends” in Wolfram|Alpha to get a list of links to your friends, then follow a link to get the personal analytics report for that friend. (Sometimes you’ll see less than in your own report, because your friends don’t allow you as much access to their data.)

It’s quite fascinating—and sometimes revealing—looking at the personal analytics reports for oneself and one’s friends. I think I could spend ages doing it. And coming back at different times to see what’s changed.

The personal analytics system we’re releasing today is just the beginning. We’re looking forward to everyone’s feedback (use the feedback box at the bottom!)—and we’re planning to keep adding more and more features and capabilities. As I said in my original post about personal analytics, I’ve no doubt that one day pretty much everyone will routinely be doing all sorts of personal analytics on a mountain of data that they collect about themselves. But it’s exciting today to be able to start that off with Wolfram|Alpha Personal Analytics for Facebook. I hope people have fun with it!  And perhaps it will also inspire some young Facebook users to become data scientists…

Kids, Arduinos and Quadricopters

I have four children, all with very different interests. My second-youngest, Christopher, age 13, has always liked technology. And last weekend he and I went to see the wild, wacky and creative technology (and other things) on display at the Maker Faire in New York.

I had told the organizers I could give a talk. But a week or so before the event, Christopher told me he thought what I planned to talk about wasn’t as interesting as it could be. And that actually he could give some demos that would be a lot more interesting and relevant.

Christopher has been an avid Mathematica user for years now. And he likes hooking Mathematica up to interesting devices—with two recent favorites being Arduino boards and quadricopter drones.

And so it was that last Sunday I walked onto a stage with him in front of a standing-room-only crowd of a little over 300 people, carrying a quadricopter. (I wasn’t trusted with the Arduino board.)

Christopher had told me that I shouldn’t talk too long—and that then I should hand over to him. He’d been working on his demo the night before, and earlier that morning. I suggested he should practice what he was going to say, but he’d have none of that. Instead, up to the last minute, he spent his time cleaning up code for the demo.

I must have given thousands of talks in my life, but the whole situation made me quite nervous. Would the Arduino board work? Would the quadricopter fly? What would Christopher do if it didn’t?

I don’t think my talk was particularly good. But then Christopher bounced onto the stage, and soon was typing raw Mathematica code in front of everyone—with me now safely off on the side (where I snapped this picture):

Christopher Wolfram on stage at Maker Faire

His demo was pretty neat. He had a potentiometer hooked up to the Arduino board. And he’d set it up so that all he had to do was type a command into Mathematica to get its value:

ArduinoAnalogRead[0]

357

Then it was Dynamic[ArduinoAnalogRead[0]], with Mathematica dynamically displaying the value in real time as he adjusted the potentiometer.

Then he makes it into a gauge (er, that’s actually from a future version of Mathematica, but Christopher is a keen user of internal development builds):

Dynamic[AngularGauge[ArduinoAnalogRead[0], {0, 1023}], UpdateInterval -> 0]

Gauge dynamically displaying the value

And then he says he’s going to make a dynamic plot of it. And pretty soon he’s typing the Mathematica program, confidently presses Shift-Return—and it actually works:

data = {}; Dynamic[rawdata = ArduinoAnalogRead[0]; AppendTo[data, rawdata]; ListLinePlot[data, Filling -> Axis, ImageSize -> 500], UpdateInterval -> 0]

Plot of potentiometer data

Then he’s on to using an ultrasound sensor, and having it produce musical notes based on distance.

And then he’s on to the quadricopter. He’d been going back and forth with someone at our company for a few days before, trying to get the kinks out of the interaction with the quadricopter‘s API. I had seen the quadricopter fly that morning, but I knew Christopher had changed the code quite a bit since then.

His plan was to have a single line of Mathematica code that would make the quadricopter fly a specified 3D path. He had a list of points for a square, entered the line of code, and pressed Shift-Return, and… nothing happened!

I guess Christopher has debugged quite a lot of code in his 13 years. And now he set about doing it in front of the audience. A missing function definition. A missing command to connect to the device. He was finding quite a few things. And I was getting ready to call out that he should just give up.

But then… the sound of quadricopter blades, and up the quadricopter goes… flying its loop on the stage, and landing.

It had actually worked! It was pretty neat, being able to just type one line of code into Mathematica, and then having some physical object fly around in the pattern one had specified:

ARDroneFlyPathGraphics[Table[{Sin[u], Sin[2 u]}, {u, 0, 2 π, π/5}]]

Path for the quadricopter

After another flight, the audience had questions. One person asked if the quadricopter could respond to its environment. Which set Christopher off on some more “spectator programming”. And actually, it took him only a line of code to get the real-time video from the flying quadricopter, and feed it through simple Mathematica image processing:

Christopher Wolfram on stage with the quadricopter flying

I was pretty impressed that all this worked (here is the full video). And, yes, Christopher was clearly right that his topics were very relevant to Maker Faire. In fact, it seemed like Arduino and quadricopters were two of the three main technical themes of the show. The third was 3D printing.

I’d actually mentioned that in my talk (at Christopher’s suggestion), pointing out that Wolfram|Alpha Pro (as well as Mathematica) can immediately make STL versions of any 3D graphic it generates.

And I was reminded that one of my own early applications of 3D printing years before had been connected to another of my children. In 2006, my daughter Catherine (then 9 years old) was very into 3D geometry, and liked exploring the 3D polyhedra that we had introduced in Mathematica 6.

We were just starting the Wolfram Demonstrations Project, and as a sample, we added a little application of polyhedra that Catherine had created with my help:

Polyhedral Koalas

Catherine had 2D printouts of many different cases, and one day we decided to try making them 3D. It took a little wrangling, but before long Catherine and I were off to a little “3D print shop” full of plastic dust, from which a little zoo of polyhedral koalas emerged:

Collection of polyhedral koalas

Every year there’s more and more for me to learn from my children. My oldest child, now age 16, has become a rather successful and uncannily sophisticated entrepreneur—from whom I’m trying to absorb what business wisdom I can. The other three are not yet as “launched”, but each has their definite interests.

For Christopher it’s technology and product design.  Learning about every new and emerging technology he can, and developing his own ideas—and often strong opinions—about it.  (At Maker Faire, I noted with interest his enthusiasm for getting a Raspberry Pi… and his long discussion about what it would mean to have Mathematica running on it…)

Christopher has always been an energetic explainer of things. But it was pretty interesting last weekend to see him for the first time “explaining” to a large audience. He was definitely the star of our joint speaking slot. And—despite a few tense moments—it was pretty fun for me to see two progeny of very different kinds—Christopher and Mathematica—work together so nicely.

Latest Perspectives on the Computation Age

This is an edited version of a short talk I gave last weekend at The Nantucket Project—a fascinatingly eclectic event held on an island that I happen to have been visiting every summer for the past dozen years.

Lots of things have happened in the world in the past 100 years. But I think in the long view of history one thing will end up standing out among all others: this has been the century when the idea of computation emerged.

We’ve seen all sorts of things “get computerized” over the last few decades—and by now a large fraction of people in the world have at least some form of computational device. But I think we’re still only at the very beginning of absorbing the implications of the idea of computation. And what I want to do here today is to talk about some things that are happening, and that I think are going to happen, as a result of the idea of computation.

Word cloud

I’ve been working on this stuff since I was a teenager—which is now about a third of a century. And I think I’ve been steadily understanding more and more.

Our computational knowledge engine, Wolfram|Alpha, which was launched on the web about three years ago now, is one of the latest fruits of this understanding.

What it does—many millions of times every day—is to take questions people ask, and try to use the knowledge that it has inside it to compute answers to them. If you’ve used Siri on the iPhone, or a bunch of other services, you’ll probably have seen Wolfram|Alpha answers.

Here’s the basic idea of Wolfram|Alpha: we want to take all the systematic knowledge that’s been accumulated in our civilization, and make it computable. So that if there’s a question that can in principle be answered on the basis of that knowledge, we can just compute the answer.

So how do we do that? Well, one starts off from data about the world. And we’ve been steadily accumulating data from primary sources about thousands of different kinds of things. Cities. Foods. Movies. Spacecraft. Species. Companies. Diseases. Sports. Chemicals. Whatever.

We’ve got a lot of data now, with more flowing in every second. And actually by now our collection of raw structured data is about as big in bytes as the text of all the human-written pages that one can find on the web.

But even all that data on its own isn’t enough. Because most questions require one not just to have the data, but to compute some specific answer from it. You want to know when some satellite is going to fly overhead? Well, we may have recent data about the satellite’s orbit. But we still have to do a bunch of physics and so on to figure out when it’s going to be over us.

And so in Wolfram|Alpha a big thing we’ve done is to try to take all those models and methods and algorithms—from science, and technology, and other areas—and just implement them all.

You might be thinking: there’s got to be some trick, some master algorithm, that you’re using. Well, no, there isn’t. It’s a huge project. And it involves experts from a zillion different areas. Giving us their knowledge, so we can make it computable.

Actually, even having the knowledge and being able to compute from it isn’t enough. Because we still have to solve the problem of how we communicate with the system. And when one’s dealing with, sort of, any kind of knowledge, any question, there’s only one practical way: we have to use human natural language.

So another big problem we’ve had to solve is how to take those ugly messy utterances that humans make, and turn them into something computable. Actually, I thought this might be just plain impossible. But it turned out that particularly as a result of some science I did—that I’ll talk about a bit later—we made some big breakthroughs.

The result is that when you type to Wolfram|Alpha, or talk to Siri… if you say something that humans could understand, there’s a really good chance we’ll be able to understand it too.

So we can communicate to our system with language. How does it communicate back to us?

What we want to do is to take whatever you ask, and generate the best report we can about it. Don’t just give you one answer, but contextualize that. Organize the information in a way that’s optimized for humans to understand.

All of this, as I say, happens many millions of times every day. And I’m really excited about what it means for the democratization of knowledge.

It used to be that if you want to answer all these kinds of questions, you’d have to go find an expert, and have them figure out the answer. But now in a sense we’ve automated a lot of those experts. So that means anyone, anywhere, anytime, can immediately get answers.

People are used to being able to search for things on the web. But this is something quite different.

We’re not finding web pages where what you’ve asked for was already written down by someone. We’re taking your specific question, and computing for you a specific answer. And in fact most of the questions we see every day never appear on the web; they’re completely new and fresh.

When you search the web, it’s like asking a librarian a question, and having them hand you a pile of books—well, in this case, links to web pages—to read. What we’re trying to do is to give you an automated research analyst, who’ll instantly generate a complete research report about your question, complete with custom-created charts and graphs and so on.

OK. So this all seems like a pretty huge project. What’s made it possible?

Actually, I’d been thinking about basically this project since I was a kid. But at the beginning I had no idea what decade—or even century—it would become possible. And actually it was a big piece of basic science I did—that I’ll talk about a bit later—that convinced me that actually it might be possible.

I’ve been involved in some big technology projects over the years. But Wolfram|Alpha as a practical matter is by far the most complicated, with the largest number of different kinds of moving parts inside it.

And actually, it builds on something I’ve been working on for 25 years. Which is a system called Mathematica. Which is a computer language. That I guess one could say is these days by far the most algorithmically sophisticated computer language that exists.

Mathematica is the language that Wolfram|Alpha is implemented in. And the point is that in Mathematica, doing something like solving a differential equation is just one command. That’s how we manage to implement all those methods and models and so on. We’re starting from this very sophisticated language we already have.
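
To give a sense of what “just one command” means, here’s a standard Mathematica one-liner (an illustration, nothing specific to Wolfram|Alpha’s internals) that solves a differential equation symbolically:

(* Solve y''[x] + y[x] == 0 with initial conditions; the result is y[x] -> Cos[x] *)
DSolve[{y''[x] + y[x] == 0, y[0] == 1, y'[0] == 0}, y[x], x]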

Wolfram|Alpha is still about 15 million lines of code—in the Mathematica language—though.

Wolfram|Alpha is about knowing everything it can about the world—with all its messiness—and letting humans interact with it quickly using natural language.

Mathematica is about creating a precise computer language, that has built in to it, in a very coherent way, all the kinds of algorithmic functionality that we know about.

Over the past 25 years, Mathematica has become very widely used. There’s broad use on essentially all large university campuses, and all sophisticated corporate R&D operations around the world. And lots and lots of things have been discovered and invented with Mathematica.

In a sense, I see Mathematica as the implementation language for the idea of computation. Wolfram|Alpha is where that idea intersects with the sort of collective accumulation of knowledge that’s happened in our civilization.

So where does one go from here? Lots and lots of places.

First, Wolfram|Alpha is using public knowledge. What happens when we use internal knowledge of some kind?

Over the last couple of years there’ve been lots of custom versions of Wolfram|Alpha created, that take internal knowledge of some company or other organization, combine it with public knowledge, and compute answers.

What’s emerging is something pretty interesting. There’s lots of talk of “big data”. But what about “big answers”?

What one needs to do is to set things up so one makes all that data computable. So it’s possible to just ask a question in natural language, and automatically get answers, and automatically generate the most useful possible reports.

So far this is something that we’ve done as a custom thing for a limited number of large organizations. But we know how to generalize this, and in a sense provide a general way to automatically get analytics done, from data. We actually introduced the first step toward this a few months ago in Wolfram|Alpha.

You can not only ask Wolfram|Alpha questions, but you can also upload data to it. You can upload all kinds of data. Like a spreadsheet, or even an image. And then Wolfram|Alpha’s goal is to automatically tell you something interesting about that data. Or, if you ask a specific question, be able to give a report about the answer.

Right now what we have works rather nicely for decently small lumps of data. We’re gradually increasing to huge quantities of data.

Here’s a kind of fun example that I did. It relates to personal analytics—or what’s sometimes called “quantified self”. I’ve been a data-oriented guy for a long time. So I’ve been collecting all kinds of data about myself. Every email for 23 years. Every keystroke for a dozen years. Every walking step for a bunch of years. And so on. I’ve found these things pretty useful in sort of keeping my life organized and productive.

Earlier this year I thought I’d take all this data I’ve accumulated, and feed it to Mathematica and Wolfram|Alpha. And pretty soon I’m getting all these plots and analyses and so on. Sort of my automated personal historian, showing me all these events and trends in my life and so on.

I have to say that I thought there must be lots of people who were collecting all sorts of data about themselves. But when I wrote about this stuff earlier this year—and it got picked up in all the usual media places—I was pretty surprised to realize that nobody came out and said “I’ve got more data than you”.

So, a little bit embarrassingly, I think I have to conclude that for now, I might be the data-nerdiest—or maybe the most computable—human around. Though we’re working to change that.

Just a few weeks ago, for example, we released Wolfram|Alpha Personal Analytics for Facebook. So people can connect their Facebook accounts to Wolfram|Alpha and immediately get all this analytics about themselves and their friends and so on.

And so far a few million have done this. It’s kind of fun to see people’s lives made computable like this. There are all these different friend networks for example. Each one tells a story. And tells one some psychology too.

So we’re talking about making things computable. What can we really make computable? What about a city?

There’s all this data in a city, collected by all sorts of municipal agencies. There’s permits, there’s reports, there’s GIS data. And so on. And if you’re a sophisticated city, you’ve got lots of this data on the web somehow. But it’s in raw form. Where really only an expert can use it.

Well, what if we were to feed it through the Wolfram|Alpha technology stack? If there’s a question that could be answered about the city on the basis of the data that exists, it’d be answerable.

What electric power line gets closest to such-and-such a building? What’s the voltage drop between some point and the nearest substation? Imagine just being able to ask those questions to a mobile phone, and having it automatically compute the answers.

Well, there are a lot of details about actually setting this up in the world, but we now have the technology to do it. To make a computable city. Or, for that matter, to make a computable country. Where all the government data that’s being generated can be set up so we can automatically answer questions from it. Either for the citizens of the country, or for the people who run it. It’ll be interesting what the first computable country is… but from a technology—and workflow—point of view, we’re now ready to do this.

So what else can be computable like this?

Here’s another example: large engineering systems. These days there’s a language called Modelica—yes, it was a Mathematica-inspired name—that’s an open standard for people who create large engineering systems. There used to be just spec sheets for engineering components. Now there are effectively little algorithms that describe each component.

We acquired a company recently that had been using Mathematica for many years to do large-scale systems engineering. And we just a couple of months ago released an integrated systems modeling product, which allows one to take, say, 50,000 components in an airplane, represent them in computable form, and then automatically compute how they’ll behave in some particular situation.

We haven’t yet assembled it all, but we now have the technology stack to do the following: you’ve got some big engineering system in front of you, and maybe it’s sent sensor data back to our servers. Now you talk to your mobile phone and you say “If I push it to 300 rpm, what will happen?” We understand the query, then run a model of the system, then tell you the answer; say “That wouldn’t be a very good idea” (preferably not a HAL voice or something).

So that’s about the operation of engineering systems. What about design?

Well, with everything being computable, it’s easy to run optimization algorithms on designs, or even to search a large space of possible designs. And increasingly what we’ll be doing is stating some design goal, then having the computer automatically figure out how to achieve that goal. It’ll know for example what components are available, with what specifications, and at what cost, and it’ll then figure out how to assemble what’s needed to achieve the design goal. Actually, there’s a much more everyday example of this that will come soon.

In Wolfram|Alpha, for example, we’ve been working with retailers to get data on consumer products. And the future will be to just ask in natural language for some product that meets some requirements, and then automatically to figure out what that is.

Or, more interestingly, to say: “I’m doing such-and-such a piece of home improvement. Figure out how much of what products I need to get to do that.” And the result should be an automatically generated bill of materials, and then the instructions about what to do with them.

There are just all these areas ripe to be made computable. Here’s another one: law.

Actually, back 300 years Leibniz was thinking about that when he first invented some precursors to the modern idea of computation. He imagined having some encoding of human laws, set up so one can ask a machine in effect to automatically figure out: “Is this legal or not?”

Well, today, there are some kinds of contracts that have already been “made computable”. Like contracts for derivative financial instruments and so on. But what if we could make the tax code computable? Or a mortgage computable? Or, more extremely, a patent.

Actually, some contracts like service-level agreements are beginning to become computable, of necessity, because in effect they have to be interpreted by computers in real time. And of course once things become computable, they can be applied in a much more democratized way, without all the experts needed, and so on.

Here’s a completely different area that I think is going to become computable, and actually that we’re planning to spin off a company to do. And that’s medical diagnosis.

When I look at the medical world, and the healthcare system, diagnosis is really a central problem. I mean, if you don’t have the right diagnosis, all the wonderful and expensive treatment in the world isn’t going to help, and actually it’s probably going to hurt.

Well, diagnosis is really hard for humans. I actually think it’s going to turn out not to be so hard for computers. It’s a lot easier for them to know more, and to not get confused about probabilities, and so on.

Of course, it’s a big project. You start off by encoding all those specialized decision trees and so on. But then you go on and grind up the medical literature and figure out what’s in there. Then you get lots of actual patient records—probably at first realistically from outside the US—and start doing analysis on those. There’s a lot about the getting of histories, and the delivery of diagnoses, that actually becomes a lot easier in an automated system.

But, you know, there’s actually something that’s inevitably going to disrupt existing medical diagnosis, and that’s sensor-based medicine. These days there are only a handful of consumer-level medical sensors, like thermometers and things. Very soon there are going to be lots. And—a little bit like the personal analytics I was talking about earlier—people are going to be routinely recording all sorts of medical information about themselves.

And the question is: how is this going to be used in diagnosis? Because when you come in with 10 megabytes of time series, that’s not just a “were you sweating a lot” question. That’s something that will have to be analyzed with an algorithm.

Actually, I think the whole medical process is going to end up being very algorithmic. Because you’ll be analyzing symptoms with algorithms, but then the treatment will also be specified by some algorithm. In fact, even though right now diagnosis is really important, I think in the end that’s sort of going to go away. One will be going straight from the observed data to the algorithm for treatment.

It’s sort of like in finance. You observe some behavior of some stock in the market. And, yes, there are technical traders who’ll start telling you that’s a “head and shoulders pattern” or something. But mostly—at least in the quant world—you’ll just be using an algorithm to decide what to do, and one doesn’t care about the sort of “descriptive diagnosis” of what’s happening.

And in medicine, I expect that the whole computation idea will extend all the way down to the molecules we use as drugs. Today drugs tend to just be molecules that do one particular thing. In the future, I think we’re going to have molecules that each act like little computers, looking around at cells they encounter, and effectively running algorithms to decide how to act.

You know, there’s some very basic questions about medical diagnosis. I like to think of the analogy of software diagnosis. You have a computer. It’s running an operating system. Things happen to it. Eventually all kinds of crud builds up, it starts running slower—and eventually it crashes; it dies.

And of course you can restart it—from the same program, effectively the same “genetic material” giving you the next generation. That’s all pretty analogous to biology. But it’s much less advanced. I mean, we have all those codes for medical conditions; there’s nothing analogous for software.

But in software, unlike in biology, in principle we can monitor every single bit of what’s happening. And we’ve just started doing some experiments trying to understand in a general way, sort of what’s optimal to monitor to do “software diagnosis”, or more interestingly, what do you have to fix on an ongoing basis to effectively “extend the lifespan” of the running program.

OK. So I’m going through lots of areas and talking about how computation affects them. Here’s another completely different one: journalism.

We’re in the interesting position now with Wolfram|Alpha of having by quite a large margin more data feeds—whose meaning we understand—coming into our system than anyone has ever had before. In other words, we sort of have this giant sensory system connected to lots of things in the world.

Now the question is: what’s interesting that’s going on in the world? We see all this data coming in. What’s unexpected? What’s newsworthy? In a sense what we want to create is computational journalism: automatically finding each hour what the “most interesting things happening in the world are”.

You know, in addition to algorithms to just monitor what’s going on, there are also algorithms to predict consequences. It might be solving the equations for the propagation of a tsunami across an ocean. I think we can pretty much do those. Or it might be—and this I’m much less sure will work—figuring out some economic or supply chain model, in kind of the same way that we figure out behavior of large engineering systems. So that we don’t just see raw news, but also compute consequences.

So that’s computation in journalism. What about computation in books? How can those be computational?

Well, actually it’s rather easy. In fact, we started a company a couple of years ago that’s effectively making computational books. It’s called Touch Press. Our first book was an interactive tour of the chemical elements, that conveniently came out the day the iPad shipped, and that showed up in lots and lots of iPad ads. I’m actually surprised there aren’t lots more entrants here. But Touch Press has become by far the most successful publisher of highly interactive ebooks—in effect computational books. And, yes, underneath it’s using pieces of our technology stack, like Mathematica and Wolfram|Alpha. And producing books on all sorts of things. The most recent two being Egyptian pyramids, and Shakespeare’s sonnets.

And actually, from Mathematica we’ve built what we call CDF—the Computable Document Format—that lets one systematically define computable documents: documents where there’s interaction and computation going on right in the document.

And from CDF—in addition to all sorts of corporate reports—we’re beginning to see a generation of textbooks that can interactively illustrate their points, perhaps pulling in real-time data too, and that can interactively let students try things out, or test themselves.

There’s a lot more to say about how computation relates to the future of education, both in form and content. We’ve been working to define a computer-based math curriculum: a view of what’s worth teaching in the 21st century, now that, for example, a large fraction of US students routinely use Wolfram|Alpha to do their homework every day. It’s exciting how much more we can teach now that knowledge and computation have been so much more democratized.

We’re also realizing—particularly with Mathematica—how much it’s possible to teach about computation, and programming, even at very early stages in education.

Some other time, perhaps, I can talk about the thinking we’ve done about how to change the structure of education—in certain ways to de-institutionalize it.

Before I finish I’d like to make sure I say just a tiny bit about what computation means not just about all the practical things I’ve been discussing, but also at a sort of deeper intellectual level. Like in science. Some of you may know that I’ve spent a great many years—in a sense as a user of Mathematica—doing basic science.

My main idea was to depart from essentially 300 years of scientific tradition, that had to do with using mathematical equations to describe the natural world, and instead sort of generalize them to arbitrary computer programs.

Well, my big discovery was that in the universe of possible computer programs, it takes only a very simple program to get incredibly complex behavior.

And I think that’s very important in understanding many systems in nature. Maybe even in giving us a fundamental theory of physics for our whole universe. It also gives us new ways of thinking about things. For philosophy. For understanding systems, and organizations and so on.

Newtonian science gave us notions like momentum and forces and integrals. That we talk about nowadays in all kinds of contexts. The new kind of science gives us notions like computational irreducibility, and computational equivalence, that give us new ways to think about things.

There are also some very practical implications. Like in technology. In a sense, technology is all about taking what exists in the world, and seeing how to harness it for human purposes. Figuring out what good a magnetic material, or a certain kind of gas, is.

In the computational universe, we’ve got all these little programs and algorithms that do all these remarkable things. And now there’s a new kind of technology that we can do. Where we define some goal. Then we search this computational universe for a program that achieves it. In a sense, what this does is to make invention free.

Actually, we’ve used this for example for creating music, and other people have used it in areas like architecture. Using a computer to do creative work. And, if one wants, to do it very efficiently. Making it economical, for example, to do mass customization.

At a very practical level, for more than a decade now we’ve routinely been creating technology not by having human engineers build it up step by step, but instead by searching the computational universe—and finding all this stuff out there that we can harness for technology. It’s pretty interesting. Sometimes what one finds is readily understandable to a human. Sometimes one can verify it works, but it’s really a very non-human solution. Something that no human on their own would have come up with. But something that one just finds out there in the computational universe.

Well, this methodology of algorithm discovery—and related methodologies for finding actual structures, for mechanical devices, molecules, and so on—will, I think, inevitably grow in importance. In fact, I’m guessing that within a few decades we’re going to find that there’s more new technology being created by those methods than by all existing traditional engineering methods put together.

Today, we tend to create in a sense using only simplified computations—because that’s what our existing methods let us work with. But in the future we’re going to be seeing in every aspect of our world much much more that’s visibly doing sophisticated computation.

I want to leave you with the thought that even after everything that’s happened with computers over the past 50 years, we haven’t seen anything yet. Computation is a much stronger concept—and actually my guess is it’s going to be the defining concept for much of the future of human history.

Mathematica 9 Is Released Today!

I’m excited to be able to announce that today we’re releasing Mathematica 9—and it’s big! A whole array of new ideas and new application areas… and major advances along a great many algorithmic frontiers.

Next year Mathematica will be 25 years old (and all sorts of festivities are planned!). And in that quarter century we’ve just been building and building. The core principles that we began with have been validated over and over again. And with them we’ve created a larger and larger stack of technology, that allows us to do more and more, and reach further and further.

From the beginning, our goal has been an ambitious one: to cover and automate every area of computational and algorithmic work. Having built the foundations of the Mathematica language, we started a quarter century ago attacking core areas of mathematics. And over the years since then, we have been expanding outward at an ever-increasing pace, conquering one area after another.

As with Wolfram|Alpha, we’ll never be finished. But as the years go by, the scope of what we’ve done becomes more and more immense. And with Mathematica 9 today we are taking yet another huge step.

New in Mathematica 9

So what’s new in Mathematica 9? Lots and lots of important things. An amazing range—something for almost everyone. And actually just the very size of it already represents an important challenge. Because as Mathematica grows bigger and bigger, it becomes more and more difficult for one to grasp everything that’s in it.

But in Mathematica 9 there’s an important new idea. We call it the Wolfram Predictive Interface™, and what it does is to automate the process of suggesting at every step what to do next. At the most basic level, when you type, there’s a context-sensitive Input Assistant that knows about all the functions and options of Mathematica. But more important, when you get output, there’s a Suggestions Bar that’s generated, with a series of buttons for top actions you might want to take next. Sometimes these buttons apply individual Mathematica functions, and sometimes they do more complex things, bringing up interactive panels if need be.

Predictive Interface

Experienced software users may be skeptical. They may be thinking: “I’ve seen these kinds of heuristic let-me-help-you systems before; typically they just get in the way”. Well, I’m happy to say that I think with the Predictive Interface in Mathematica 9 we’ve made a breakthrough. Of course, it helps that we have all that experience—and all those query logs—from Wolfram|Alpha. But the result is that even for an experienced Mathematica user like myself, the Predictive Interface really does well, and it makes my use of Mathematica substantially better. And for people new to Mathematica, I think it’ll be a game-changer. Never again will they be left in a “so what do I do next?” state; they’ll always be given suggestions about how to move forward, as well as automatically be shown what’s possible in Mathematica.

There are all sorts of other interface enhancements in Mathematica 9 too. But what about the computational capabilities of Mathematica? What’s new there in Mathematica 9?

Here’s an immediate “crowd pleaser”: social network analysis. There’s now a function in Mathematica that lets you instantly get data from the APIs of popular social networks—that you can then immediately visualize and analyze using all the capabilities of Mathematica, including many new graph theoretical and statistical functions especially added in Mathematica 9 for social networks. A couple of months ago we introduced Wolfram|Alpha Personal Analytics for Facebook—which has become a highly popular service. Now we’re introducing general, programmable, social network analysis in Mathematica—which promises to be very valuable not only for professional data scientists, but also for math and computer science students who want to jump immediately to the frontiers of one of the hottest current areas.

Social network analysis
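
As a minimal sketch of the new workflow (Mathematica prompts for Facebook authorization the first time this runs):

(* Fetch your friend network and visualize its community structure *)
g = SocialMediaData["Facebook", "FriendNetwork"];
CommunityGraphPlot[g]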

Over the last 25 years there have been a few requests for new features in Mathematica that just keep on coming in over and over again. One of those is for Mathematica to support units—like centimeters and gigabytes. Early on we created an add-on package that did just fine for simple cases, and that many people have been happily using for ages. But try as we might, we never figured out how to do a true “Mathematica-class” job of supporting units—and so we never built them into the core of the system.

Well, one feature of Wolfram|Alpha is that it includes by far the most complete handling of units ever. I used to think units were comparatively simple. But I now know they’re messy and complicated, not least because to be at all usable in practice, people have to be able to refer to them with all sorts of weird short notations. Now here’s the great thing we realized in Mathematica 9: we can just use Wolfram|Alpha-based free-form linguistics to let people enter units however they feel like. But then we can turn the units into precise symbolic expressions—that we can then support throughout Mathematica, not just in simple arithmetic, but in calculus, visualization, data analysis, and much more. And so, after all these years, in Mathematica 9 we finally have units built in—not as some kind of hack, but in a really clean, streamlined and long-term way.

Units in Mathematica
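
Here’s a small sketch of what built-in units look like in practice:

(* Quantities are symbolic expressions that flow through arithmetic and conversion *)
speed = Quantity[60., "Miles"/"Hours"];
UnitConvert[speed, "Meters"/"Seconds"]   (* roughly 26.82 m/s *)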

Mathematica has a vast web of interconnected computational capabilities. And in each new version, we build on what is already there to reach and cover still more areas. And to me a remarkable aspect of this is just how much needs to already exist in Mathematica to be able to reach and successfully cover these new areas. Sometimes there may be some fairly easy way to implement simple examples of a new capability. But to get really good thorough coverage needs the whole stack that we’ve spent 25 years building up.

And whenever we tackle a new area in Mathematica, we try to do it to full depth and breadth. Usually this means we have to figure out all sorts of new ideas and algorithms. And often a whole new way of looking at the area. That typically dramatically clarifies the area, and both makes it accessible to a much wider class of people, and allows it to be successfully used as a long-term building block in the further development of Mathematica.

In Mathematica 9, we’ve covered quite a collection of new areas.

One example, from a “classic” Mathematica area, concerns differential equations. In Wolfram SystemModeler, we handle all sorts of systems described by differential equations. In Mathematica 9, we’ve now built in capabilities for solving differential equations with discontinuities (e.g. for a ball bouncing on a surface), hybrid discrete/continuous equations, and parametric and eigenvalue differential equations. Back in the 1970s I remember writing a Fortran program to solve an eigenvalue version of the Schrödinger equation; now finally in 2012 it’s little more than a one-liner in Mathematica 9.
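
The bouncing-ball case, for instance, now comes down to something like this minimal sketch, using the new WhenEvent mechanism to handle the discontinuity at each bounce:

(* A ball dropped from height 5, losing 5% of its speed at each bounce *)
sol = NDSolve[{y''[t] == -9.81, y[0] == 5, y'[0] == 0,
    WhenEvent[y[t] == 0, y'[t] -> -0.95 y'[t]]}, y, {t, 0, 10}];
Plot[Evaluate[y[t] /. sol], {t, 0, 10}]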

There’s a whole collection of new capabilities in Mathematica 9 around statistical systems and statistical modeling. We’ve been gradually building towards these for several versions. In Mathematica 7 we introduced all sorts of statistical distributions, and all sorts of methods for fitting data. Then in Mathematica 8 we introduced a very clean symbolic formalism for handling probabilities and probability distributions—as well as filling out a wide range of statistical analysis capabilities. And in Mathematica 9 we’re now extending from probability distributions to a full range of random processes.

We’re covering time series, Markov chains, queues, reliability, survival, stochastic differential equations, random graphs, and more. It’s all very clean and very unified. In each case, there’s a symbolic way to represent a model. And then it’s quite beautiful how everything fits together. Say you’ve got some data. In Mathematica 8, you could fit it to some statistical distribution. Now in Mathematica 9 you can use the exact same functions to fit it to a time series model or a differential equation, or whatever.
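
As a small sketch of that unification: here’s data simulated from an autoregressive process, then fitted back to the same symbolic model family:

(* Simulate an AR(1) process, then estimate its parameters from the data *)
data = RandomFunction[ARProcess[{0.7}, 1], {0, 200}];
EstimatedProcess[data, ARProcess[{a}, v]]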

25 years ago, Mathematica didn’t put much emphasis on statistics. But over the years, we’ve steadily been building out extremely strong and often beyond-state-of-the-art statistical capabilities. And at this point, we’ve covered with great depth and robustness what’s needed for the vast majority of areas where statistical methods are used.

Of course, Mathematica is not an island. We’ve put a lot of emphasis on making sure that it can import and export an immense range of formats. And also that it can communicate with many external programs and systems. So in Mathematica 9, one thing we’ve added is built-in integration with the R statistics language. It’s pretty cool: I think it’s fair to say that it’s now easier to use R from within Mathematica than directly in the R system itself. So if there’s a package that’s been written in R for some specialized statistical task, you can immediately and seamlessly use it inside Mathematica.
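
Getting to R from inside Mathematica is now just a matter of loading RLink. A quick sketch:

Needs["RLink`"]
InstallR[]   (* start the embedded R runtime *)
REvaluate["summary(rnorm(100))"]   (* run R code; results come back as Mathematica expressions *)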

Over the last few years, Mathematica has become a major player in the emerging field of data science. And to support that, we’ve been steadily expanding the types of data for which Mathematica has strong built-in support. We first added image processing in Mathematica 7, then enhanced it in Mathematica 8. Now in Mathematica 9, we’re making Mathematica’s already very complete image processing system do still more. There’s a convenient interactive Image Assistant, there’s feature tracking and face detection, there’s support for HDR and color profiles and there’s the ability to do out-of-core processing on very large images.

But probably the single most striking feature of image processing in Mathematica 9 is that it can handle not only 2D images, but also 3D volumetric ones. And in typical Mathematica style, most functions that work on 2D images now just seamlessly work on 3D images too. Beginning back in the early 1980s I used to try to use 3D volume rendering to visualize 3D cellular automata. And finally now what has always been elaborate and painful has become a Mathematica 9 one-liner that executes instantaneously. I’ve also over the years tried quite a few times to manipulate 3D DICOM-style data from MRIs and the like—and it’s always been quite challenging. But now in Mathematica 9 it’s become incredibly easy—and one’s immediately able not just to do visualization, but also to use all sorts of sophisticated analysis methods.

3D cellular automata
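
To give a flavor of this, here’s a minimal sketch that builds a volumetric image from raw voxel data and hands it to the same kinds of functions that work on 2D images:

(* A 3D Gaussian blob, rendered as a volumetric image *)
data = Table[Exp[-(x^2 + y^2 + z^2)],
   {x, -2, 2, .1}, {y, -2, 2, .1}, {z, -2, 2, .1}];
ImageAdjust[Image3D[data]]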

Another area that’s come of age in Mathematica 9 is signal processing—now with hundreds of functions for efficiently analyzing and filtering signals. It’s pretty impressive how smoothly it fits into Mathematica. Whether operating on a standard time-domain signal, or audio, or 2D or 3D images—and whether one’s doing visualization, applying continuous or discrete calculus, or doing high-precision or exact computations. And because Mathematica is a symbolic language, it’s immediately possible to represent filters for signal processing in a symbolic way—so they can be designed and manipulated with the full power of the Mathematica system, as well, for example, as being shared with Wolfram SystemModeler.
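
A typical small example (note that for a plain list of samples, the cutoff frequency is given in radians per sample):

(* A 5 Hz sine sampled at 500 Hz, plus noise, cleaned with a lowpass filter *)
signal = Table[Sin[2 Pi 5 t] + 0.5 RandomReal[{-1, 1}], {t, 0, 1, 1/500}];
ListLinePlot[{signal, LowpassFilter[signal, 0.2]}, PlotRange -> All]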

Mathematica 8 began the introduction of built-in control theory capabilities in Mathematica. Mathematica 9 fills this out, adding PID controllers, time delays and full support for descriptor systems. And of course, all this is fully integrated with signal processing, visualization and everything else in Mathematica, and Wolfram SystemModeler.

The list of new frontiers and new capabilities in Mathematica 9 is long. Two more that have been long in the making are built-in support for vector analysis and for symbolic tensors. In both cases, there have been deep challenges both in algorithms and in design. Indeed, for example, for more than 20 years I’ve been thinking about how to conveniently fit traditional vector analysis notation—with its often implicit reference to coordinate systems—into Mathematica. And it’s interesting that what it’s taken to solve this problem is in a sense a deeper understanding of just what coordinate systems really mean. But the result is that it’s now easy in Mathematica 9 to deal with symbolic vector expressions and with vector calculus in all standard named coordinate systems.
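
So one can now write things like the following (two quick illustrations):

(* Divergence of the radial field, and the Laplacian of 1/r, in spherical coordinates *)
Div[{r, 0, 0}, {r, θ, φ}, "Spherical"]    (* 3 *)
Laplacian[1/r, {r, θ, φ}, "Spherical"]    (* 0, away from the origin *)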

In working with symbolic tensors, I myself have a long history. Indeed, the first large-scale package that I ever wrote for symbolic computation—in 1978—dealt with symbolic tensors. But it’s taken until now for us to understand at the level needed for Mathematica just how really to work with them. A key problem is how to canonicalize products of tensors with contracted indices. I always suspected that there might be really powerful algorithms for this. And indeed there are, based on graph theory. And they’re now fully implemented in Mathematica 9. With the result that computations in general relativity that even recently seemed like major research projects now happen in mere seconds.
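
Here’s a classic instance of the kind of canonicalization involved: the full contraction of a symmetric tensor with an antisymmetric one vanishes identically, and TensorReduce now works that out symbolically:

(* s is symmetric, a is antisymmetric; their full contraction reduces to 0 *)
TensorReduce[
 TensorContract[TensorProduct[s, a], {{1, 3}, {2, 4}}],
 Assumptions ->
  Element[s, Arrays[{n, n}, Reals, Symmetric[{1, 2}]]] &&
   Element[a, Arrays[{n, n}, Reals, Antisymmetric[{1, 2}]]]]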

Looking down the complete list of what’s new in Mathematica 9, it’s pretty impressive. In addition to major new areas, there are countless extensions and enhancements throughout the system. Whether it’s responding to suggestions from math teachers to conveniently support real-valued cube roots. Or to allow giant data arrays limited only by 64-bit addressing. Or to support programmatic access to password-protected websites, synchronously or asynchronously. Or to support business dates in a streamlined way. Or to add elegantly designed interactive gauges to use in dashboards or for controls. Or to have a systematic framework for adding legends to any kind of plot. Or to make it possible to do enterprise-level distribution of CDF documents that make use of live data.
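
Two of those smaller additions, sketched:

CubeRoot[-27]   (* -3, the real-valued cube root *)
Plot[{Sin[x], Cos[x]}, {x, 0, 2 Pi}, PlotLegends -> "Expressions"]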

It’s been two years since we released Mathematica 8, and to me it’s very impressive how many new things have been finished in just those two years. Back with Mathematica 6, we built a framework that allowed us to start growing Mathematica at a much faster rate. And it’s interesting to see the effect of that in the plot below of the growth of the number of functions built into Mathematica. Today as we add higher and higher levels of automation to Mathematica, we are increasingly dealing with “superfunctions” that each cover larger and larger areas of functionality. But even so, we see that the dramatic growth in total number of functions has continued with Mathematica 9.

Mathematica functions over time

I’ve spent nearly half my life so far overseeing the design of Mathematica. And so for me it’s particularly interesting to see two major developments in Mathematica 9 that relate to design. The first is the increasing use of Wolfram|Alpha ideas and functionality in Mathematica, for example in the handling of units. And the second is the arrival of the Predictive Interface, which provides a new level of automation and discoverability in using Mathematica. Already in Mathematica 9 these are important directions. But I expect in the future they’ll give us the flexibility and the new ways of thinking that we need to unlock a whole sequence of spectacular possibilities.

One might think that after so many years and so many versions, it wouldn’t feel much different to be using Mathematica 9 compared to Mathematica 8. But it does. From the very beginning, whether it’s the new updated design, or the Predictive Interface, it’s very clear that Mathematica 9 is something fundamentally sleeker and stronger than anything that’s come before. And for me, what has happened with Mathematica 9—as with previous new versions of Mathematica—is that I quickly start being able to do more things, more quickly. Old programs that took many lines I can now replace with single, more general and easier-to-understand, Mathematica 9 functions. And things that I never got around to doing before I now do, because they’ve become so easy in Mathematica 9.

Being the kind of software company CEO that I am, I’m always using the latest test versions of all our products while they’re under development. But especially with Mathematica, it’s only close to the end that one can typically see the full vision of a new version emerge from all the threads of development that it involves. And so it has been with Mathematica 9. But what we have now is exciting, ground-breaking, and a great pleasure to use. And I am proud to be able to announce that as of today it is available to everyone.

“What Are You Going to Do Next?” Introducing the Predictive Interface

There aren’t very many qualitatively different types of computer interfaces in use in the world today. But with the release of Mathematica 9 I think we have the first truly practical example of a new kind—the computed predictive interface.

If one’s dealing with a system that has a small fixed set of possible actions or inputs, one can typically build an interface out of elements like menus or forms. But if one has a more open-ended system, one typically has to define some kind of language. Usually this will be basically textual (as it is for the most part for Mathematica); sometimes it may be visual (as for Wolfram SystemModeler).

The challenge is then to make the language broad and powerful, while keeping it as easy as possible for humans to write and understand. And as a committed computer language designer for the past 30+ years, I have devoted an immense amount of effort to this.

But with Wolfram|Alpha I had a different idea. Don’t try to define the best possible artificial computer language, that humans then have to learn. Instead, use natural language, just like humans do among themselves, and then have the computer do its best to understand this. At first, it was not at all clear that such an approach was going to work. But one of the big things we’ve learned from Wolfram|Alpha is that with enough effort (and enough built-in knowledge), it can. And indeed two years ago in Mathematica 8 we used what we’d done with Wolfram|Alpha to add to Mathematica the capability of taking free-form natural language input, and automatically generating from it precise Mathematica language code.

But let’s say one’s just got some output from Mathematica. What should one do next? One may know the appropriate Mathematica language input to give. Or at least one may be able to express what one wants to do in free-form natural language. But in both cases there’s a kind of creative act required: starting from nothing one has to figure out what to say.

So can we make this easier? The answer, I think, is yes. And that’s what we’ve now done with the Predictive Interface in Mathematica 9.

The concept of the Predictive Interface is to take what you’ve done so far, and from it predict a few possibilities for what you’re likely to want to do next.

Predictive interface

In Mathematica 9, the way this works is that when you have an output in focus, a Suggestions Bar appears below it, with a list of buttons for possible actions to take next. What buttons appear is determined by a computation that is run in real time when you request the Suggestions Bar.

There are two kinds of inputs to this computation. The first is what actions are common given the structure of the output, and the earlier history of your session. And the second is what actions would lead to useful results.

Over time, the Predictive Interface will be able to learn from the actions people take in it. But to get started, we’ve used several large sources. The first is the collection of carefully tuned heuristic algorithms that Wolfram|Alpha uses to determine what pods to output given a particular input. The second is the billions of actual queries that we’ve seen in the Wolfram|Alpha query stream. The third is published Mathematica code—for example the huge number of examples in the Wolfram Documentation Center and the Wolfram Demonstrations Project. And the fourth is our very large internal sample of Mathematica code from the source code of both Mathematica itself and Wolfram|Alpha.

From the query stream and heuristic algorithms in Wolfram|Alpha we learn many a priori probabilities for actions to be taken on different kinds of objects, as well as a certain amount about sequences of actions. From samples of Mathematica code we learn what kinds of actions occur together—for example what the probabilities are for different functions to appear applied to different kinds of arguments.

In a first approximation, what happens in the Predictive Interface is that possible outputs are categorized into hundreds of different general types, each represented by a symbolic expression that encodes certain properties and attributes. Then, informed by our various sources, large numbers of probabilistic rules are set up to determine conceivable actions that might be suggested for particular combinations of types. Which actions actually make sense to suggest will then depend on which ones lead to useful results. So typically what the Predictive Interface does is to try out candidate actions—or tests based on them—and then use heuristics to assess the utility of the results they would give.

The Predictive Interface will often internally find quite a large number of possible actions to suggest. These then have to be ranked so that the best choices can be presented first. And the way this is done is through a fairly elaborate system of scores, that combine specific heuristic algorithms with probabilistic information and assessments of the utility of results.
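
In outline, and emphatically as a toy sketch rather than the actual implementation, the ranking step works something like this, where priorProbability and utilityScore are hypothetical stand-ins for the learned and heuristic pieces described above:

(* Hypothetical sketch: rank candidate actions by prior probability times assessed utility *)
rankSuggestions[output_, candidates_List] :=
 SortBy[candidates, -priorProbability[#] utilityScore[#[output]] &]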

But after all this sophisticated computation, what the user ultimately sees is just a simple list of buttons for possible actions to take.

Let’s look at an example:

Predictive Interface example

The Predictive Interface is suggesting a few things to do with this integer. And they seem pretty sensible. But actually they’re even better than we might expect. Because the Predictive Interface is using information it gets by actually doing the computations it’s suggesting. Let’s try another integer:

Predictive Interface example

Again the suggestions seem pretty sensible. But they’re different. And the reason is that the Predictive Interface knows that this particular integer is a prime. So it can tell for example that a primality test will be particularly worth doing on it. Sometimes it can be quite uncanny how prescient the Predictive Interface manages to be. But what I’ve found is that when one gets used to this, it’s surprisingly useful not only in its primary purpose of guiding one through what one can do, but also in giving implicit hints about features of what one’s seeing.

It’s interesting to compare the concept of the Predictive Interface with the way Wolfram|Alpha generates reports. In Wolfram|Alpha, given a particular input—like an integer, for example—Wolfram|Alpha will generate a sequence of “pods” which it displays down the page, giving results of computations that its heuristic algorithms determine are interesting. In the Predictive Interface there’s also selection of computations going on, but now not for the purpose of actually displaying the results of these computations. Instead, after all sorts of internal work, all that’s actually displayed is a sequence of buttons, presented for the human user to pick what to do next.

Of course, one can combine these approaches, and when the Predictive Interface determines that it’s going to be useful, it “lights up” the Wolfram|Alpha logo in the Suggestions Bar. If you then press this, you’ll get the whole Wolfram|Alpha result—from which you can pick an individual pod to generate a specific new Mathematica input.

Predictive Interface example

The Predictive Interface makes suggestions that lead to many different kinds of actions. But what we’ve found is that it’s best to present every suggestion in a more or less uniform way—by having one or more plain English words on what is effectively a button. Sometimes pressing the button may just apply some individual Mathematica function, like Solve (for equation solving), or FactorInteger (for integer factorization). And in this case, the next input makes it immediately obvious what was done:

Predictive Interface example
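
A session of that kind might look like this (a hypothetical example, with the second input being what the suggestion button generates):

2^2^5 + 1
FactorInteger[%]   (* {{641, 1}, {6700417, 1}}, Euler's famous factorization *)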

Sometimes the Predictive Interface may end up composing pieces of code on the fly, which again can be applied with individual buttons:

Predictive Interface example

Often these pieces of code are simple enough that it is not distracting to display them completely. But sometimes it’s better to “hide” them by default, with an “opener” provided if one wants to actually look at the code:

Predictive Interface example

And sometimes the code is actually an invocation of Wolfram|Alpha:

Predictive Interface example

The Predictive Interface ranks suggestions, and immediately displays the top few. If you press “more…”, you’ll get a panel, which will typically show many more suggestions, now arranged in categories:

Predictive Interface example

Sometimes there’s just one form of a particular suggestion. But quite often there are alternatives or options. When there are a fairly small number of easy-to-understand choices, the Predictive Interface just lists them by name in a pull-down:

Predictive Interface example

Or, if what’s going on is more complicated, it shows previews of what the result would be for different choices:

Predictive interface example

In many cases, there isn’t just a list of possible choices; instead one may have to “fill in a form” to say what one wants:

Predictive Interface example

In general, the Predictive Interface can present an essentially arbitrary user interface. Internally, it’s just generating a symbolic Mathematica expression, which can then make use of anything in the Mathematica interface language—for example to give a custom-created “wizard-like” panel:

Predictive Interface example

In simple cases, the Predictive Interface just gives a simple row of suggestions buttons. But it’s fairly common for there to be very different kinds of suggestions depending on what a particular output is supposed to mean. And in such cases we’ve found that instead of mixing suggestions based on different meanings, it’s much better just to pick a default meaning, then offer other meanings as explicit alternatives—analogous to the way Wolfram|Alpha’s “assuming” mechanism works:

Predictive Interface example

When one uses the Predictive Interface, it’s pretty common to end up with a whole string of inputs and outputs. The way the Predictive Interface is set up, each new input contains the previous output. Here’s a sequence, with the explicit Predictive Interface Suggestions Bars for each line added in:

Predictive Interface example

If you want to repeat the sequence of computations here, you typically want to “roll them up” into a single line—which is what the spiral icon in the Suggestions Bar does:

Predictive Interface example

Right now, the Predictive Interface concentrates on making suggestions for single outputs—though it often makes use of context and previous history. In the future, we’re planning to do a lot more on multi-input suggestions, and various forms of refactoring, as well as on full Mathematica programs.

When we started building the Predictive Interface it was not at all clear it was going to end up working out. Previous examples of “suggestions” interfaces had generally not been well received (think for example Microsoft’s “Clippy” intelligent paperclip—which I have to say I always found charming, if not especially useful). But I suspected the problem was not the general idea of providing contextual suggestions, but the way they were being generated and presented.

And the key idea that’s led to our Predictive Interface for Mathematica 9 is to put real computation into figuring out what to suggest. There’s never going to be a complete, precise algorithm to determine what a person is going to want to do next (though people are often much more predictable than one might expect). And instead, what one needs to do is to have a whole collection of heuristic algorithms that come as close as possible to being able to make a computer “do what I mean”.

I have to say that in past years I was always skeptical about heuristics. Because I thought people would find them very frustrating. When one has a precise language and system like Mathematica, the essence of good design is to make everything completely consistent, so people can readily predict what the system will do in a particular case. But heuristics go in the opposite direction, trying to cover common cases well, but not worrying at all about overall consistency.

But here’s a key point I’ve learned from creating Wolfram|Alpha: if heuristics are done well, with serious computation and knowledge behind them, they actually do work, and people like them very much. Wolfram|Alpha is absolutely full of heuristics: for understanding free-form linguistic input, for deciding what output to generate, etc. And—as it is so often with computer systems—so long as everything “just works”, people never think about the heuristics, never try to deconstruct them, and never notice or get confused by the lack of ultimate consistency.

The Predictive Interface is technically rather different from Wolfram|Alpha. But the notion of having a whole web of heuristics based on serious computation and knowledge is the same. In practice with the Predictive Interface it’s also important that it presents itself with the appropriate level of emphasis: it’s there, and easy to get at if one wants it, but understated enough that it doesn’t visually get in the way if one doesn’t need it.

And I have to say that in my own use of Mathematica it seems to work well. If I know what to do after I get a particular output, I just type the next input without getting distracted by the Predictive Interface. But as soon as I pause even for a moment, I tend to glance at the Predictive Interface. And often it jogs my thinking, and gives me exactly what I want to do next.

I’ve now had a chance to watch a few Mathematica beginners use the Predictive Interface. It seems to work really nicely. It gives them the satisfying experience of making progress more quickly than ever before, and gently exposes them to a wider range of capabilities in Mathematica that they might not otherwise discover for a long time.

Mathematica is a big enough system that I don’t think anyone (even myself) can immediately remember everything it does. So particularly if one is using a somewhat unfamiliar part of the system, the Predictive Interface is highly useful. And even in areas of the system that I know well, it’s just faster to press a Predictive Interface button than to type an input.

The Predictive Interface consists of a kind of infinite web of suggestions. And it’s rather fun to try starting with something simple, and seeing just how far one can go simply by following Predictive Interface suggestions. It’s remarkable how quickly one can end up doing some pretty sophisticated things.

One of the big things I’ve always tried to do in Mathematica (and in Wolfram|Alpha and so on) is to automate as much as possible: to make it so that whatever the computer can automatically handle it does automatically handle. In the past, we’ve done a lot on automating the selection of algorithms, automating the way output and visualizations are presented, or interfaces are built, and, in Wolfram|Alpha, automating the way input is interpreted. With the Predictive Interface, we’re attacking yet another “automation frontier”: automating how one chooses what to do next.

From the point of view of interface design, I am finding the Predictive Interface extremely interesting. Previous interfaces—like menus or forms or computer languages or free-form linguistics—make certain kinds of things easy. The Predictive Interface makes a new set of things easy. And as I work on future design for Mathematica—as well as other products of ours—I can already see my thinking changing as a result of the Predictive Interface.

In language design one typically wants to have a minimal number of names and concepts for people to remember. In free-form linguistic input one wants to support whatever people will immediately think of—and it’s barely worth covering things that people will never “think to ask”. But with the Predictive Interface one has a new mechanism. Once people are going in a particular general direction, one can present to them suggestions that let them discover things that they’d never think were there, or were even possible.

This is particularly important for a system like Mathematica that is deep and broad. It’s too easy for people to spend years using just a small part of the system, and never get the benefit of its wider capabilities. But now the Predictive Interface constantly leads people to other parts of the system that they can immediately use and become familiar with.

The Predictive Interface in Mathematica 9 is just the beginning. In time, it will become possible to make still richer and more sophisticated predictions, using not just information from the current session, but all sorts of history, data and analytics about the user. The direction is in a sense maximal automation: to have the user define a goal, and then have the computer figure out as much as possible about how to achieve it. Sometimes the user will be able to specify the goal using natural language or computer language. But often they won’t have formulated it completely enough to do so, and instead they’ll just be able to state a general direction to go in. At which point the Predictive Interface can take over, making suggestions and letting the user guide the computer in the direction they want.

Today the Predictive Interface is available in Mathematica 9. We already have other products in the works that make use of it. And in the future I expect to see sophisticated computed predictive interfaces show up all over the place—defining in a sense a new paradigm for interacting with computers.

Welcome, National Museum of Mathematics


I was just in New York City for the grand opening of the National Museum of Mathematics. Yes, there is now a National Museum of Mathematics, right in downtown Manhattan. And it’s really good—a unique and wonderful place, which I’m pleased to say I’ve been able to help, in various ways, bring into existence over the past 3 years.

Museum of Mathematics logo

Of all companies, ours is probably the one that has been most involved in bringing math to the world (Mathematica, Wolfram|Alpha, Wolfram Demonstrations Project, MathWorld, Computer-Based Math, Wolfram Foundation, …). And for a long time I’ve thought how nice it would be if there were a substantial, physical, “museum of mathematics” somewhere. But until recently I’d sort of assumed that if such a thing were going to exist, I’d have to be the one to make it happen.

A little more than 3 years ago, though, my older daughter picked out of my mail a curious folding geometrical object—which turned out to be an invitation to an event about the creation of a museum of mathematics. At first, it wasn’t clear what kind of museum this was supposed to be. But as soon as we arrived at the event, it started to be much clearer: this was “math as physical experience”. With the centerpiece of the event, for example, being a square-wheeled tricycle that one could ride on a cycloidal “road”—a mathematical possibility that, as it happens, was the subject of some early Mathematica demonstrations.

Behind the museum was a small group led by Glen Whitney, an energetic math PhD and recently “retired” hedge fund quant, whom I had never met, but with whom I turned out to have quite a few connections in common. Soon I was involved as a trustee of the fledgling museum, and four other people from our company were on its advisory council. Needless to say, there were many questions and issues. But a prominent early one was what the logo for the museum should be.

What iconic image would best capture mathematics and its character, and be lively enough to connect with the youthful target group for the museum? Our company has had a strong graphic design tradition, and is of course deeply involved in math. So it was natural for me to suggest that perhaps we could make an early contribution to the museum by trying to develop a logo.

I posed the problem at our company, and quickly got a response from Chris Carlson in our User Interfaces group. Chris has been at our company for more than 18 years, and has long had an interest in computable forms (and in fact his PhD in architecture from Carnegie Mellon was on this subject). So perhaps it was not surprising that Chris suggested not just a logo, but a computable “meta-logo”—an infinite family of possible logos, generated by simple rules.

His idea was ultimately quite simple: pick a mathematical symbol and apply a sequence of symmetry transformations to it. But as is so often the case with simple rules, the results can be elaborate and striking:

Museum of Mathematics logo array

His original proposal, developed with members of our Design group (including our longtime art director Jeremy Davis), addressed some of the new possibilities—and issues—presented by the meta-logo. The museum didn’t have to have a single official logo; it could have logos created all the time by its visitors. And the logos didn’t have to be static; they could animate too. Different projects or events could have their own special logos. And the logo itself could serve as a simple puzzle (“what symbol is that?”). And so on. Of course there were issues too. Like would a meta-logo, with its infinite variations, still be recognizable?

But after some discussion at the museum, the decision was made that, yes, the meta-logo was going to work. And indeed its very variability and structure seemed to capture remarkably well some of the most important features of mathematics. Gradually all sorts of lovely possibilities emerged for how a mass-customizable meta-logo could be deployed—from personalized business cards for the staff, to “logo IDs” for visitors, to being able to decorate almost anything with multiple variations of the logo.

And almost three years later, there I was a few days ago, walking down 26th St. from Fifth Avenue in New York, and I look up to see:

MoMath flags with logos

I reach the museum, and of course the (temporary) main entrance is full of logos:

MoMath temporary entrance with logos

It’s a few hours before the opening gala event. And of course inside it’s a hive of activity. The first thing I see is the logo generator station for visitors. But oops… there are no logos there yet, just lots of Mathematica code on the screen:

Last-minute Mathematica code for MoMath logos

A basic interactive system for generating logos is a tiny amount of Mathematica code. But getting everything exactly right turns out to be quite tricky, and to involve some rather sophisticated mathematics. It’s easy to apply symmetry operations to regions representing font characters. The issue is rendering them. The most obvious thing is to layer them in the order they’re generated. But if the regions overlap, then doing that can break symmetry. And the only way to guarantee that symmetry is preserved is to do some intricate computational geometry, breaking up the regions just right.
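
To give the flavor, here's a minimal sketch of the symmetry idea (my reconstruction, not the museum's actual code), using exactly the naive layered rendering that the real generator has to improve on:

    (* a toy meta-logo: place a glyph off-center, then overlay n rotated
       copies; with naive layering, overlapping glyphs can break symmetry,
       which is what the real generator's computational geometry fixes *)
    metaLogo[glyph_String, n_Integer] :=
      Graphics[
        Table[
          Rotate[Text[Style[glyph, 72, Bold], {1, 0}], 2 Pi k/n, {0, 0}],
          {k, 0, n - 1}],
        PlotRange -> 2]

    metaLogo["\[Pi]", 6]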

And with hours to spare, the final touches were finished, and the logo generator was ready. And so was the rest of the museum.

The MoMath logo generator in situ at the opening gala

It’s an impressive two floors of exhibits, full of inventive ideas about how to make math tangible to all, from middle school (or below) on. I’m quite a connoisseur of both mathematics and museums. But what impresses me most about MoMath is how different its exhibits are from what I’ve seen before. Each one has a unique idea, typically one I’ve never seen before, that elegantly illustrates some fundamental mathematical principle, and does it in a way that only a physical exhibit can. Like nesting in the “human tree”:

Human tree

Often there are computers involved in the exhibits. But they’re controlled and used in very physical ways—that makes it clear why this has to be a physical museum, not some kind of virtual entity on the web. To me, it’s also very cool that most of the exhibits can be understood at several levels. There’s the basic math that’s being illustrated. And then there’s the “how does that actually work?”, and the “how does one build something to do this?”. Like the non-holographic 3D images (that can’t be captured in a photograph):

3D non-holograms

It wasn’t cheap to make the museum. And to me it’s impressive how much money was raised for the cause of math: enough to put the museum in the center of Manhattan, and to create such beautifully made exhibits, with elegantly machined pieces that are nevertheless designed to be tough enough to withstand the onslaught of young users.

When I was a toddler (long, long ago!) I was fortunate enough to live near the Science Museum in London, which at the time was just opening its first hands-on gallery, and which I insisted on being taken to visit almost every day for a whole summer.

I don’t know what my early experiences with those exhibits imprinted on me. But it’s remarkable to see—half a century later—how the advance of technology and all the creativity that’s been put into MoMath has led to such vastly richer exhibits.

In the coming days and weeks, lots of kids—and adults—will get their first taste of the first museum of mathematics ever to exist in the US. No doubt MoMath will become quite a destination for students, tourists—and, I suspect, events of all sorts. People will learn a lot—and have fun. And remember for a long time their ride on the square-wheeled tricycle—much as I still remember a ride I had on a Galapagos tortoise at the London Zoo nearly 50 years ago. (At the MoMath gala, my younger daughter happened to be captured by a photographer from the Wall Street Journal moments after this picture was taken…)

Square-wheeled tricycle

There could have been a Museum of Mathematics in the US long long ago. But there wasn’t. And it’s only now, through the remarkable efforts of a small number of people, that MoMath exists. It’s going to be a great institution, and it’s one more step in the effort that we’ve been so involved with to bring the heritage and promise of mathematics to the 21st century.


Remembering Richard Crandall (1947-2012)


Richard Crandall liked to call himself a “computationalist”. For though he was trained in physics (and served for many years as a physics professor at Reed College), computation was at the center of his life. He used it in physics, in engineering, in mathematics, in biology… and in technology. He was a pioneer in experimental mathematics, and was associated for many years with Apple and with Steve Jobs, and was proud of having invented “at least 5 algorithms used in the iPhone”. He was also an extremely early adopter of Mathematica, and a well-known figure in the Mathematica community. And when he died just before Christmas at the age of 64 he was hard at work on his latest, rather different, project: an “intellectual biography” of Steve Jobs that I had suggested he call “Scientist to Mr. Jobs”.

I first met Richard Crandall in 1987, when I was developing Mathematica, and he was Chief Scientist at Steve Jobs’s company NeXT. Richard had pioneered using Pascal on Macintoshes to teach scientific computing. But as soon as he saw Mathematica, he immediately adopted it, and for a quarter of a century used it to produce a wonderful range of discoveries and inventions.

He also contributed greatly to Mathematica and its usage. Indeed, even before Mathematica 1.0 in 1988, he insisted on visiting our company to contribute his expertise in numerical evaluation of special functions (his favorites were polylogarithms and zeta-like functions). And then, after the NeXT computer was released, he wrote what may have been the first-ever Mathematica-based app: a “supercalculator” named Gourmet that he said “eats other calculators for breakfast”. A couple of years later he wrote a book entitled Mathematica for the Sciences, which pioneered the use of Mathematica programs as a form of exposition.

Over the years, I interacted with Richard about a great many things. Usually it would start with a “call me” message. And I would get on the phone, never knowing what to expect. And Richard would be talking about his latest result in number theory. Or the latest Apple GPU. Or his models of flu epidemiology. Or the importance of running Mathematica on iOS. Or a new way to multiply very long integers. Or his latest achievements in image processing. Or a way to reconstruct fractal brain geometries.

Richard made contributions—from highly theoretical to highly practical—to a wide range of fields. He was always a little too original to be in the mainstream, with the result that there are few fields where he is widely known. In recent years, however, he was beginning to be recognized for his pioneering work in experimental mathematics, particularly as applied to primes and functions related to them. But he always knew that his work with the greatest immediate significance for the world at large was what he did for Apple behind closed doors.

Richard was born in Ann Arbor, Michigan, in 1947. His father was an actuary who became a sought-after expert witness on complex corporate insurance-fraud cases, and who, Richard told me, taught him “an absolute lack of fear of large numbers”. Richard grew up in Los Angeles, studying first at Caltech (where he encountered Richard Feynman), then at Reed College in Oregon. From there he went to MIT, where he studied the mathematical physics of high-energy particle scattering (Regge theory), and got his PhD in 1973. On the side he became an electronics entrepreneur, working particularly on security systems, and inventing (and patenting) a new type of operational amplifier and a new form of alarm system. After his PhD these efforts led him to New York City, where he designed a computerized fire safety and energy control system used in skyscrapers. As a hobby he worked on quantum physics and number theory—and after moving back to Oregon to work for an electronics company there, he was hired in 1978 at Reed College as a physics professor.

Steve Jobs had ended his short stay at Reed some years earlier, but through his effort to get Reed computerized, Richard got connected to him, and began a relationship that would last the rest of Steve’s life. I don’t know even a fraction of what Richard worked on for NeXT and Apple. For a while he was Apple’s Chief Cryptographer—notably inventing a fast form of elliptic curve encryption. And later on, he was also involved in compression, image processing, touch detection, and many other things.

Through most of this, Richard continued as a practicing physics professor. Early on, he won awards for creating minimal physics experiments (“measure the speed of light on a tabletop with $10 of equipment”). By the mid-1980s, he began to concentrate on using computers for teaching—and increasingly for research. One particular direction that Richard had pursued for many years was to use computers to study properties of numbers, and for example search for primes of particular types. And once he had Mathematica, he got involved in more and more sophisticated number-theoretic mathematics, particularly around primes, among other things co-authoring the (Mathematica-assisted) definitive textbook Prime Numbers: A Computational Perspective.

He invented faster methods for doing arithmetic with very long integers, which were instrumental, for example, in early crowdsourced prime discoveries, and which are in fact used today in modified form in Mathematica. And by doing experimental mathematics with Mathematica he discovered a wonderful collection of zeta-function-related results and identities worthy of Ramanujan. He was particularly proud of his algorithms for the fast evaluation of various zeta-like functions (notably polylogarithms and Madelung sums), and indeed earlier this year he sent me the culmination of his 20 years of work on the subject, in the form of a paper dedicated to Jerry Keiper, the founder of the numerics group at Wolfram Research, who died in an accident in 1995, but with whom Richard had worked at length.

Richard was always keen on presentation, albeit in his own somewhat unique way. Through his “industrial algorithm” company Perfectly Scientific, he published a new poster every time a Mersenne prime was discovered. The price of the poster increased with the number of digits, and for convenience his company also sold a watchmaker’s loupe to allow people to read the digits on the posters.

Richard always had a certain charming personal ponderousness to him, his conversation peppered with phrases like “let me commend to your attention”. And indeed as I write this, I find a classic example of over-the-top Richardness in the opening to his Mathematica for the Sciences: “It has been said that the evolution of humankind took a substantial, discontinuous swerve about the time when our forepaws left the ground. Once in the air, our hands were free for ‘other things’. Toolmaking. …”, and eventually, as he explains after his “admittedly conjectural rambling”, computers and Mathematica.

Richard regularly visited Steve Jobs and his family, with his last visit being just a few days before Steve died. He was always deeply impressed by Steve, and frustrated that he felt people didn’t understand the strength of Steve’s intellect. He was disappointed by Walter Isaacson’s highly successful biography of Steve, and had embarked on writing his own “intellectual biography” of Steve. He had years of interesting personal anecdotes about Steve and his interactions with him, but he was adamant that his book should tell “the real story”, about ideas and technology, and should at all costs avoid what he at least considered “gossip”. At first, he was going to try to take himself completely out of the story, but I think I successfully convinced him that with his unique role as “scientist to Steve Jobs”, he had no choice but to be in the story, and indeed to tell his own story along the way.

Richard was in many ways a rather solitary individual. But he always liked talking about his now-15-year-old daughter, whom he would invariably refer to rather formally as “Ellen Crandall”. He had theories about many things, including child rearing, and considered one of his signature quotes to be “the most efficient way to raise an atheist kid is to have a priest for a father”. And indeed as part of the last exchange I had with him just a few weeks before he died, he marveled that his daughter from a “pure blank, white start” … “has suddenly taken up filling giant white poster boards with minutely detailed drawing”.

While his overall health was not perfect, Richard was in many ways still in the prime of his life. He had ambitious plans for the future, in mathematics, in science and in technology, not to mention in writing his biography of Steve Jobs. But a few weeks ago, he suddenly fell ill, and within ten days he died. A life cut off far too soon. But a unique life in which much was invented that would likely never have existed otherwise.

I shall miss Richard’s flow of wonderfully eccentric ideas, as well as the mysterious “call me” messages, and, of late, the practically monthly encouragement speech about the importance of having Mathematica on the iPhone. (I’m so sorry, Richard, that we didn’t get it done in time.)

Richard was always imagining what might be possible, then in his unique way doggedly trying to build towards it. Around the world at any time of day or night millions of people are using their iPhones. And unknown to them, somewhere inside, algorithms are running that one can imagine represent a little piece of the soul of that interesting and creative human being named Richard Crandall, now cast in the form of code.

What Should We Call the Language of Mathematica?


At the core of Mathematica is a language. A very powerful symbolic language. Built up with great care over a quarter of a century—and now incorporating a huge swath of knowledge and computation.

Millions and millions of lines of code have been written in this language, for all sorts of purposes. And today—particularly with new large-scale deployment options made possible through the web and the cloud—the language is poised to expand dramatically in usage.

But there’s a problem. And it’s a problem that—embarrassingly enough—I’ve been thinking about for more than 20 years. The problem is: what should the language be called?

Usually on this blog when I discuss our activities as a company, I talk about progress we’ve made, or problems we’ve solved. But today I’m going to make an exception, and talk instead about a problem we haven’t solved, but need to solve.

You might say, “How hard can it be to come up with one name?” In my experience, some names are easy to come up with. But others are really really hard. And this is an example of a really really hard one. (And perhaps the very length of this post communicates some of that difficulty…)

language

Let’s start by talking a little about names in general. There are names like, say, “quark”, that are in effect just random words. And that have to get all their meaning “externally”, by having it explicitly described. But there are others, like “website” for example, that already give a sense of their meaning just from the words or word roots they contain.

I’ve named all sorts of things in my time. Science concepts. Technologies. Products. Mathematica functions. I’ve used different approaches in different cases. In a few cases, I’ve used “random words” (and have long had a Mathematica-based generator of ones that sound good). But much more often I’ve tried to start with a familiar word or words that capture the essence of what I’m naming.

And after all, when we’re naming things related to our company, we already have a “random” base word: “wolfram”. For a while I was a bit squeamish about using it, being that it’s my last name. But in recent years it’s increasingly been the “lexical glue” that holds together the names of most of the things we’re doing.

And so, for example, we have products like Wolfram Finance Platform or Wolfram SystemModeler for professional markets that have that “random” wolfram word, but otherwise try to say more or less directly what they are and what they do.

Wolfram|Alpha is aimed at a much broader audience, and is a more complex case. Because in a short name we need to capture an almost completely new concept. We describe Wolfram|Alpha as a “computational knowledge engine”. But how do we shorten that to a name?

I spent a very long time thinking about it, and eventually decided that we couldn’t really communicate the concept in the name, and instead we should just communicate some of the sense and character of the system. And that was how we ended up with “alpha”: with “alphabet simplicity”, a connection to language, a technical character, a tentative software step, and the first, the top. And I’m happy to say the name has worked out very well.

OK. So what about the language that we’re trying to name? What should it be called?

Well, I’m pretty sure the word “language” should appear in the name, or at least be able to be tacked onto the name. Because if nothing else, what we’ve got really is quintessentially a language: a set of constructs that can be strung together to represent an infinite range of meanings.

Our language, though, works in a somewhat different way from ordinary human natural language—most importantly, because it’s completely executable: as soon as we express something in the language, that immediately gives us a specification for a unique sequence of computational actions that should be taken.

And in this respect, our language is like a typical computer language. But there is a crucial difference, both practical and philosophical. Typical computer languages (like C or Java or Python) have a small collection of simple built-in operations, and then concentrate on ways to organize those operations to build up programs. But in our language—built right into the language—is a huge amount of computation capability and knowledge.

In a typical computer language, there might be libraries that exist for different kinds of computations. But they’re not part of the language, and there’s no guarantee they fit together or can be built on. But in our language, the concept from the very beginning has been to build as much as possible in, to have a coherent structure in which as much is automated as possible. And in practice this means that our language has thousands of carefully designed functions and structures that automate a vast range of computations and deliver knowledge in immediately usable ways.

So while in some aspects of its basic mode of operation our language is similar to typical computer languages, its breadth and content are much more reminiscent of human languages—and in a sense it generalizes and deepens both concepts of language.

But OK, what should it be called? Well, I first started thinking about this outrageously long ago—actually in 1990. The software world was different then, and there were different ways we might have deployed the language back then. But despite having put quite a bit of software engineering work into it, we in the end never released it at all. And the single largest reason for that, embarrassingly enough, was that we just couldn’t come up with a name for it that we liked.

The “default name” that we used in the development process was the M Language, with M presumably short for Mathematica. But I never liked this. It seemed too much like C—a language which I’d used a lot, but whose character and capabilities were utterly different from our language. And particularly given the name “C”, M seemed to suggest a language somehow based on “math”. Yet even at that time—and to a vastly greater extent today—the language is about much much more than math. Yes, it can do math really well. But it’s broad and deep, and can do an immense range of other algorithmic and computational things—and also an increasing range of things related to built-in knowledge.

One might ask why Mathematica is named as it is. Well, that was a difficult naming process too. The original development name for Mathematica was Omega (and there are still filetype registrations for Mathematica based on that). Then there was a brief moment when it was renamed Polymath. Then Technique. And then there were a whole collection of possibilities, captured in this old list:

Possible names for Mathematica

But finally, at the urging of Steve Jobs, we settled on a name that we had originally rejected for being too long: Mathematica. My original conception of the system—as well as the foundations we built for it—went far beyond math. But math was the first really obvious application area—which is why, when Mathematica was first released, we described it as “a system for doing mathematics by computer”.

I’ve always liked Mathematica as a name. And back in 1988 when Mathematica was launched, it introduced in many ways a new type of name for a computer system, with a certain classical stylishness. In the years since, the name Mathematica has been widely imitated (think Modelica, for example). But it’s become clear that for Mathematica itself the name “Mathematica” is in some sense much too narrow—because it gives the idea that all that Mathematica does is math.

For our language we don’t want to have the same kind of problem. We want a name that communicates the generality and breadth of the language, and is not tied to one particular application area or type of usage. We want a name that makes sense whether the language is used to do tiny pieces of interactive work or to create giant enterprise applications, and whether it’s used by seasoned software engineers, by casual script tweakers, or by kids getting their first introduction to programming.

My personal analytics data show that I’ve been thinking about the problem of naming our language for 23 years—with episodic bursts of activity. As I mentioned, the original internal name was the M Language. More recently the default internal name has been the Wolfram Language.

Back in the early 1990s, one of my favorite ideas was Lingua—the Latin for language (as well, unfortunately, as tongue), analogous to the Latin character of Mathematica. But Lingua just sounded too weird, and the “gwa” was unpronounceable by too many people whose native languages don’t contain that sound. There was some brief enthusiasm for Express (think “expression”, as well as “express train”), but it died quickly.

There were early suggestions from the MathGroup Mathematica community, like Principia, Harmony, Unity and Tongue (in the latter case, a wag pointed out that bugs could be “slips of the tongue”). One summer intern who worked on the language in 1993 was Sergey Brin (later of Google fame); he suggested the name Thema—”the heart of mathematica” (“ma-thema-tica”). My own notes from that time record rather classical-sounding name ideas like Radix, Plurum, Practica and Programos. And in addition to thinking a lot about it myself, I asked linguists, classicists, marketers and poets—as well as a professional naming expert. But somehow every name either said too little or too much, was too “heavy” or too “light”, or for some reason or another just sounded silly. And after more than 20 years, we still don’t have a name we like.

But now, with all the new opportunities that exist for it, we just have to release the language—and to do that we have to solve the problem of its name. Which is why I’ve been thinking hard about it again.

So, what do we want to communicate about the language? First and foremost, as I explained above, it’s not like other languages. In a sense, it’s a new kind of language. It’s computational, but it’s also got intrinsic content: broad knowledge, structures and algorithms built in. It’s a language that’s highly scalable: good for programs ranging from the absolutely tiny to the huge. It’s a very general language, useful for a great many different kinds of domains. It’s a symbolic language with very clear principles, that can describe arbitrary structures as well as arbitrary data. It’s a fusion of many styles of programming, notably functional and pattern based. It’s interactive. And it prides itself on coherence of design, and tries to automate as much as possible of what it does.

At this point, we pretty much have to have “wolfram”—or at least some hint of it—in the name. But it would be nice if there was a good short name or nickname too. We want to communicate that the language is something that we as a company take responsibility for, but also that it will be very widely and often freely available—and not some kind of rare expensive thing.

All right. So an obvious first question is: how are languages typically named? Well, in Wolfram|Alpha, we have data on more than 16,000 human languages, current and former. And, for example, of the 100 with the most speakers, 13% end in -ese (think Japanese), 11% in -ic (think Arabic), 8% in -ian (think Russian), 5% in -ish (think English) and 3% in -ali (think Bengali). (If one looks at more languages, -ian becomes more common, and -an and -yi start to appear often too.) So should our language be called Wolframese, Wolframic, Wolframian, Wolframish or Wolframaic? Or perhaps Wolfese, Wolfic or Wolfish? Or Wolfian or Wolfan or Wolfatic, or the exotic Wolfari or Wolfala? Or a variant like Wolvese or Wolvic? There are some interesting words here, but to me they all sound a bit too much like obscure tribal languages.
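
(Generating such candidates is of course itself a one-liner in the very language we’re trying to name; here’s one way:

    (* paste the common human-language endings onto the base word *)
    "Wolfram" <> # & /@ {"ese", "ic", "ian", "ish", "aic"}

)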

OK. So what about computer languages? Well, there’s quite a diversity of names. In rough order of their introduction, some notable languages have been: Fortran, LISP, Algol, COBOL, APL, Simula, SNOBOL, BASIC, PL/1, Logo, Pascal, Forth, C, Smalltalk, Prolog, ML, Scheme, C++, Ada, Erlang, Perl, Haskell, Python, Ruby, Java, JavaScript, PHP, C#, .NET, Clojure, Go.

So how are these names constructed? Some—particularly earlier ones—are abbreviations, like Fortran (“Formula Translation”) and APL (“A Programming Language”). Others are names of people (like Pascal, Ada and Haskell). Others are named for companies, like Erlang (“Ericsson language”) and Go (“Google”). And still others are named in whimsical sequences, like BCPL to B to C (“sea”) to shell to Perl (“pearl”) to Ruby—or just plain whimsically, like Python (“Monty Python”). And these naming trends just continue if one looks at less well-known languages.

There are two important points here: first, it seems like computer languages can be called pretty much anything; unlike for most human languages (which are usually derived from place names), no special linguistic indicator seems to have emerged for computer languages. And second, the names of computer languages only rarely seem immediately to communicate the special features or aspirations of a given language. Sometimes they refer to computer-language history, but often they just seem like quite random words.

So for us, this suggests that perhaps we should just use our existing “random word”, and call our language the Wolfram Language, or WL—or conceivably in short form just Wolfram.

Or we could start from our “random word” wolfram, and go more whimsical. One possibility that has generated some enthusiasm internally is Wolf. Unfortunately wolves tend to have scary associations—but at least the name Wolf immediately suggests an obvious idea for an icon. And we even already have a possible form for it. Because when we introduced special-character fonts for Mathematica in the mid-1990s, we included a \[Wolf] character that was based on a little iconic drawing of mine. Dressing this up could give quite a striking language icon—that could even appear as a single character in a piece of text.

Wolf logo

There are variants, like WolframCode or WolframScript—or Wolfcode or Wolfscript—but these sound either too obscure or too lightweight. Then there’s the somewhat inelegant WolframLang, or its shorter forms WolfLang and WolfLan, which sound too much like Wolfgang. Then there are names like WolframX and WolfX, but it’s not clear the “X” adds much. Same with WolframQ or WolframL. There’s also WolframPlus (Wolfram+), WolframStar (Wolfram*) or WolframDot. Or Wolfram1 (when’s 2?), WolframCore (remember core memory?) or WolframBase. There are also Greek-letter suffixes, Wolfram|Alpha-style, like Wolfram Omega or Wolfram Lambda (“wolf”, “ram” and “lamb”: too many animals!). Or one could go shorter, like the W Language, but that sounds too much like C.

Of course, if one’s into “wolf whimsical”, there are all kinds of places to go. Wolf backwards is Flow, though that hardly seems appropriate for a language so far from simple flowcharts. And then there are names like Howl and Growl which I can’t take too seriously. If one goes into wolf folklore, there are plenty of words and names—but they seem more suited to the Middle Ages than the future.

One can go classical, but the Latin word for wolf is Lupus, which is also the name of a disease. And the Greek is Lukos [λυκος], which just seems like a random word to modern ears. With different case endings, one gets “differently styled” words. But none of the alternate cases or variants of these words (like Lupum, Lupa or Lukon) are too promising either—though at least I get to use my knowledge of Latin and Greek from when I was a kid to determine that. (And English forms like Lupine are amusing, but don’t make it.)

And in the direction of whimsical, there are also words like Tungsten, the common English name for element 74, whose symbol W stands for “wolfram”, and whose most common ore is wolframite. (And no, it was not discovered by an ancestor of mine.)

How about doing something more scientific? Like searching a space of all possible names, “NKS style”. For example, one can just try adding all possible single letters to “wolfram”, giving such unpromising names as Wolframa, Wolframz and Wolframé. With two letters, one gets things like Wolframos, Wolframix and WolframUp. One can try just appending all possible short words, to get things like WolframHow, WolframWay and WolframArt. And it’s a single line of code in our unnamed language (or Mathematica) to find the distribution of, say, what follows “am” in typical English words—yielding ideas like Wolframsu, Wolframity or the truly unfortunate Wolframble.
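
Here, for the record, is a plausible version of that one-liner (my reconstruction, not necessarily the original):

    (* tally which letters follow "am" in dictionary words, commonest first *)
    Sort[Tally[Flatten[StringCases[DictionaryLookup["*am*"], "am" ~~ c_ :> c]]],
      #1[[2]] > #2[[2]] &]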

But what about going in the other direction, and trying to find word forms that actually relate to what we’re trying to communicate about the language? A common way to make up new but suggestive forms is to go back to classical or Indo-European roots, and then try to build novel combinations or variants of these. And of course if we use an actual word form from a language, we at least know that it survived the natural selection of linguistic evolution.

There was a time in the past when one could have taken almost any Latin or Greek root, and expected it to be understood in educated company (as perhaps cyber- was when it was introduced from the Greek [κυβερνητης] for steersman or rudder). But in today’s world we pretty much have to limit ourselves to roots which are already at least somewhat familiar from existing words.

And in fact, in the relevant area of “semantic space”, “lexical space” is awfully crowded with rather common words. “Language”, for example, is lingua (“linguistics”) or sermo (“sermon”) in Latin, and glossa [γλωσσα] (“glossary”) or phone [φωνη] (“telephone”) in Greek. “Computation” is computatio in Latin, and arithmos [αριθμος] (“arithmetic”) or logismos [λογισμος] (“logistics”) in Greek. “Knowledge” is scientia (“science”) or cognitio (“cognition”) in Latin, and episteme [επιστημη] (“epistemology”), mathesis [μαθησις] (“mathematics”) or gnosis [γνωσις] (“diagnosis”) in Greek. “Reasoning” is ratio (“rational”) in Latin, and logos [λογος] (“-ology”) in Greek. And so on.

But what can we form from these kinds of roots? I haven’t been able to find anything terribly appealing. Typically the names are either ugly, or immediately suggest a meaning that is clearly wrong (like Wolframology or Wolfgloss).

One can look at other languages, and indeed if you just type “translate word” into Wolfram|Alpha (and then press More a few times), you can see translations for as many as a few hundred languages. But typically, beyond Indo-European languages, most of the forms that appear seem random to an English speaker. (Bizarrely, for example, the standard transliteration of the word for “wolf” in Chinese is “lang”.)

So where can we go from here? One possible direction is this. We’ve been trying to find a name by modifying or supplementing the word “wolfram”, and expecting that the word “language” will just be added as a suffix. But we need to remember that what we have is really a new kind of language—so perhaps it’s the word “language” that we should be thinking of modifying.

But how? There are various prefixes—usually Greek or Latin—that get added, for example, to scientific words to indicate some kind of extension or “beyondness”: ana-, alto-, dia-, epi-, exa-, exo-, holo-, hyper-, macro-, mega-, meta-, multi-, neo-, omni-, pan-, pleni-, praeter-, poly-, proto-, super-, uber-, ultra- and so on. And from these Wolfram hyperlanguage (WHL?) is perhaps the nicest possibility—though inevitably it sounds a little “hypey”, and is perhaps too reminiscent of hypertext and hyperlinks. (Layering on the Greek and Latin there’s Hyperlingua too.)

Wolfram superlanguage, Wolfram omnilanguage and Wolfram megalanguage all sound strangely “last century”. Wolfram ultralanguage and Wolfram uberlanguage both seem to be “trying a bit too hard”, though Wolfram Ultra (without the “language” at all) is a bit better. Wolfram exolanguage pleasantly shortens to Wolfex, but means the wrong thing (think “exoplanet”). Wolfram epilanguage (or just Wolfram Epi) does better in terms of meaning (think “epistemology”), but sounds very technical.

A rather frustrating case is Wolfram metalanguage (WML). It sounds nice, and in Greek even means more or less the correct thing. But “metalanguage” has already come to have a meaning in English (a language about another language)—and it’s not the meaning we want. Wolfram Meta might be better, but has the same problem.

So, OK, if we can’t make a prefix to the word “language” work, how about just adding a word or phrase between “wolfram” and “language”? Obviously the resulting name is going to be long. But perhaps it’ll have a nice abbreviation or shortening.

One immediate idea is Wolfram Knowledge Language (WKL), but this has the problem of sounding like it might just be a knowledge representation language, not a language that actually incorporates lots of knowledge (as well as algorithms, etc.). More accurate would be Wolfram Knowledge-Based Language (Wolfram KBL), and perhaps, whatever the name, “knowledge-based language” could be used as a description.

Another direction is to insert the word “programming”. There’s of course Wolfram Programming Language (WPL). But perhaps better is to start by describing the new kind of programming that our language makes possible—which one might call “hyperprogramming”, or conceivably “metaprogramming”. (“Macroprogramming” might have been nice, but it’s squashed by the old concept of “macros”.) And so conceivably one could have Wolfram Hyperprogramming Language (WolframHL, WolframHPL or WHL) or Wolfram Metaprogramming Language (WML)—or at least one can use “hyperprogramming language” or “metaprogramming language” as description.

OK, so what’s the conclusion? I suppose the most obvious metaconclusion is that getting a name for our language is hard. And the maddening thing is that once we do get a name, my whole 20-year quest will be over incredibly quickly. Perhaps the final name will be one we’ve already considered, but just weren’t thinking about correctly (that’s basically what happened with the name Mathematica). Or perhaps some flash of inspiration will lead to a new great name (which is basically what happened with Wolfram|Alpha).

What should the name be? I’m hoping to get feedback on the ideas I’ve discussed here, as well as to get new suggestions. I must say that as I was writing this post, I was sort of hoping that in the end it would be a waste, and that by explaining the problem, I would solve it myself. But that hasn’t happened. Of course, I’ll be thrilled if someone else just outright suggests a great name that we can use. But as I’ve described, there are many constraints, and what I think is more realistic is for people to suggest frameworks and concepts from which we’ll get an idea that will lead to the final name.

I’m very proud of the language we’ve built over all these years. And I want to make sure that it has a name worthy of it. But once we have a name, we will finally be ready to finish the process of bringing the language to the world—and I’ll be very excited to see all the things it makes possible.

Talking about the Computational Future at SXSW 2013


Last week I gave a talk at SXSW 2013 in Austin about some of the things I’m thinking about these days—including quite a few that I’ve never talked publicly about before. Here’s a video, and a slightly edited transcript:


Well, this is a pretty exciting time for me. Because it turns out that a whole bunch of things that I’ve been working on for more than 30 years are all finally converging, in a very nice way. And what I’d like to do here today is tell you a bit about that, and about some things I’ve figured out recently—and about what it all means for our future.

This is going to be a bit of a wild talk in some ways. It’s going to go from pretty intellectual stuff about basic science and so on, to some really practical technology developments, with a few sneak peeks at things I’ve never shown before.

Let’s start from some science. And you know, a lot of what I’ll say today connects back to what I thought at first was a small discovery that I made about 30 years ago. Let me tell you the story.

I started out at a pretty young age as a physicist. Diligently doing physics pretty much the way it had been done for 300 years. Starting from this-or-that equation, and then doing the math to figure out predictions from it. That worked pretty well in some cases. But there were too many cases where it just didn’t work. So I got to wondering whether there might be some alternative; a different approach.

At the time I’d been using computers as practical tools for quite a while—and I’d even created a big software system that was a forerunner of Mathematica. And what I gradually began to think was that actually computers—and computation—weren’t just useful tools; they were actually the main event. And that one could use them to generalize how one does science: to think not just in terms of math and equations, but in terms of arbitrary computations and programs.

So, OK, what kind of programs might nature use? Given how complicated the things we see in nature are, we might think the programs it’s running must be really complicated. Maybe thousands or millions of lines of code. Like programs we write to do things.

But I thought: let’s start simple. Let’s find out what happens with tiny programs—maybe a line or two of code long. And let’s find out what those do. So I decided to do an experiment. Just set up programs like that, and run them. Here’s one of the ones I started with. It’s called a cellular automaton. It consists of a line of cells, each one either black or not. And it runs down the page computing the new color of each cell using the little rule at the bottom there.

Rule 254

OK, so there’s a simple program, and it does something simple. But let’s point our computational telescope out into the computational universe and just look at all simple programs that work like the one here.

Cellular automata rules

Well, we see a bunch of things going on. Often pretty simple. A repeating pattern. Sometimes a fractal. But you don’t have to go far before you see much stranger stuff.

This is a program I call “rule 30”. What’s it doing? Let’s run it a little longer.

Rule 30

That’s pretty complicated. And if we just saw this somewhere out there, we’d probably figure it was pretty hard to make. But actually, it all comes just from that tiny program at the bottom. That’s it. And when I first saw this, it was my sort of little modern “Galileo moment”. I’d seen something through my computational telescope that eventually made me change my whole world view. And made me realize that computation—even as done by a tiny program like the one here—is vastly more powerful and important than I’d ever imagined.

Cellular automata
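
And by the way, producing pictures like these takes just one line of Mathematica, something like:

    (* rule 30, started from a single black cell and run for 200 steps *)
    ArrayPlot[CellularAutomaton[30, {{1}, 0}, 200]]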

Well, I’ve spent the past few decades working through the consequences of this. And it’s led me to build a new kind of science, to create all sorts of practical technology, and to think about almost everything in a different way. I published a big book about the science about ten years ago. And when the book came out, there was quite a bit of “paradigm shift turbulence”. But looking back it’s really nice to see how well the science has taken root.

Stephen Wolfram—A New Kind of Science

Academic papers making use of NKS

And for example there are models based on my kinds of simple programs showing up everywhere. After 300 years of being dominated by Newton-style equations and math, the frontiers are definitely now going to simple programs and the new kind of science.

But there’s still one ultimate app out there to be done: to figure out the fundamental theory of physics—to figure out how our whole universe works. It’s kind of tantalizing. We see these very simple programs, with very complex behavior.

Cellular automaton

It makes one think that maybe there’s a simple program for our whole universe. And that even though physics seems to involve more and more complicated equations, somewhere underneath it all there might just be a tiny little program. We don’t know if things work that way. But if out there in the computational universe of possible programs, the program for our universe is just sitting there waiting to be found, it seems embarrassing not to be looking for it.

Now if there is indeed a simple program for our universe, it’s sort of inevitable that it has to operate kind of underneath our standard notions like space and time and so on. Maybe it’s a little like this.

Network

A giant network of nodes that make up space, a bit like molecules make up the air in this room. Well, you can start just trying possible programs that create such things. Each one is in a sense a candidate universe.

Collection of universes
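
Just to give a sense of the kind of object involved, here's a toy stand-in (not one of the actual candidate models): a random network in which every node has exactly three connections:

    (* a random trivalent network: 50 nodes, each with exactly 3 neighbors *)
    RandomGraph[DegreeGraphDistribution[ConstantArray[3, 50]]]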

And when you do this, you can pretty quickly say most of them can’t be our universe. Time stops after an instant. There are an infinite number of dimensions. There can’t be particles or matter. Or other pathologies.

But what surprised me is that you don’t have to go very far in this universe of possible universes before you start finding ones that are very plausible. And that for example seem like they’ll show the standard laws of gravity, and even some features of quantum mechanics. At some level it turns out to be irreducibly hard to work out what some of these candidate universes will do. But it’s quite possible that already caught in our net is the actual program for our universe. The whole thing. All of reality.

Well, if you’d asked me a few years ago what I thought I’d be doing now, I’d probably have said “hunting for our universe”. But fortunately or unfortunately, I got seriously sidetracked. Because I realized that once one starts to understand the idea of computation, there’s just an incredible amount of technology one can build—that’s to me quite fascinating, and that I think is also pretty important for the world. And in fact, right off the bat, there’s a whole new methodology one can use for creating technology.

I mean, we’re used to doing traditional engineering—where we build things up step by step. But out there in the computational universe, we now know that there are all these programs lying around that already do amazing things. So all we have to do is to go out and mine them, and find ones that fit whatever technological purpose we’re trying to achieve.

And actually we’ve been using this kind of automated algorithm discovery for quite some time now. By now Mathematica and Wolfram|Alpha are full of algorithms and programs that no human would ever have come up with, but were just found by systematically searching the computational universe. There’s a lot that can be done like this. Not just for algorithms, but for art, like this, and for physical structures and devices too.

WolframTones

Here’s an important point that comes from the basic science. 75 years ago Alan Turing gave us the idea of universal computation. Which is what showed that software was possible, and eventually launched the whole computer revolution. Well, from the science I’ve done comes what I call the Principle of Computational Equivalence. Which among other things implies that not only are universal computers possible; they’re actually really common out there in the computational universe. Like this is the simplest cellular automaton we know is a universal computer—with that tiny little rule at the bottom there.

Rule 110

And from a very successful piece of crowdscience that we did a few years ago, we know this is the simplest possible universal Turing machine.

The Wolfram 2,3 Turing Machine Research Prize

Tiny things. That we can reasonably expect exist all over the natural world. But that are computationally just as powerful as any computer we can build, or any brain, for example. Which explains, by the way, why so much of nature seems so hard for us to decode.

And actually, this starts to get at some big old questions. Like free will. Or like the nature of intelligence. And one of the things that comes out of the Principle of Computational Equivalence is that there really can’t be something special that is intelligence—it’s all just computation. And that has important consequences for thinking about extraterrestrial intelligence. And also for thinking about artificial intelligence.

For me it was this philosophical breakthrough that led to a very practical piece of technology: Wolfram|Alpha. Ever since I was a kid I’d been interested in seeing how to take as much of the knowledge that’s been accumulated in our civilization as possible and make it computable. Somehow make it so that if there’s a question that can be answered on the basis of this knowledge, it can be done automatically.

For years I thought that doing that would require building something like a brain. And every decade or so I would ask myself if it was time yet, and I would conclude that it was just too hard. But finally from the Principle of Computational Equivalence I realized that, no, it all had to be doable just with computation. And that’s how I came to start building Wolfram|Alpha.

I hope you’ve mostly seen Wolfram|Alpha—on the web, in Siri, in apps, or wherever.

life of pi box office vs die hard

The idea is: you ask a question, in natural language, and Wolfram|Alpha tries to compute the answer, and generate a report, using knowledge that it has. At some level, this is an insanely difficult thing to make work. And if we hadn’t managed to do it, I might have thought it was pretty much impossible.

First, you’ve got to get all that data, on all sorts of things in the world. And no, you can’t just forage it from the web. You have to actually go interact with all the primary sources. Really understand the data, with actual human experts. And curate it to the point where it can reliably be used to compute from. And by now I think we’ve got more bytes of raw data inside Wolfram|Alpha than there is meaningful text content on the whole web.

But that’s only the beginning. Most questions people have aren’t answered just by retrieving a piece of data. They need some kind of computation. And for that we’ve had to take all those methods and models and algorithms that come from science and engineering and financial analysis and whatever and implement them. And by now it’s more than ten million lines of very high-level Mathematica code.

So we can compute lots of things. But now we’ve got to know what to compute. And the only realistic way for humans to interface with something this broad is through humans’ natural language. It’s not just keywords; it’s actual pieces of structured language, written or spoken. And understanding that stuff is a classic hard problem.

But we have two secret weapons. First, a bunch of methods from my new kind of science. And second, actual underlying knowledge, a bit like us humans have, that lets us decode and disambiguate.

Over the 3 years since Wolfram|Alpha launched I’m pleased at how far we’ve managed to get. It’s hard work, but now more than 90% of the queries that come to our website we can completely understand. We’ve really cracked the natural language problem, at least for these small snippets.

So once we’ve understood the input, what do we do? Well, what we’ve found is that people almost never want just one answer—42 or whatever. They want a whole custom report built for them. And we’ve developed a methodology now for automatically figuring out what information to present, and how to present it.

Many millions of people use this every day. A few web tourists. An awful lot of students, and professionals, and people wanting to figure all kinds of things out. It’s kind of nice to see how few of the queries we get are things that you can just search for on the web. People are asking us fresh, new questions whose answers have never been written down before. So the only way to get those answers would be to find a human expert to ask—or to have Wolfram|Alpha compute them. It’s a huge project that I personally expect to keep working on forever.

It’s fascinating of course. Combining all these different areas of human knowledge. Figuring out things like how to curate and make computable human anatomy, or the 3 million or so theorems that exist in the literature of mathematics. I’m quite proud of how far we’ve got already, and how much faster we’re getting at doing things.

Wolfram|Alpha examples

And, you know, it’s not just about public knowledge. We’re also now able to bring in uploaded material, and use our algorithms and knowledge to analyze it. We can bring in a picture. And Wolfram|Alpha will tell us things about it.

Image upload with Wolfram|Alpha Pro

And we could explicitly tell Wolfram|Alpha to do some image computation. It works really nicely on a phone. Or we could upload a spreadsheet. And Wolfram|Alpha can use its linguistics to decode what’s in it, and then automatically generate a report about what’s interesting in the data.

Or we could get data from some internal database and ask natural language questions about it. And get custom reports automatically generated that can use external data as well as internal data. It’s incredibly powerful. And actually we have quite a business going building custom versions of Wolfram|Alpha for companies and other organizations.

It’s gradually getting more and more automated, and actually we’re planning to spin off a company specifically to do this kind of thing.

And you know, given the Wolfram|Alpha technology stack, there are so many places to go. Like having Wolfram|Alpha not just generate information, but actually do things too. You tell it something in natural language. And it uses algorithms and knowledge to figure out what to do.

Here’s a sophisticated case. As part of our high-end business, last year we released Wolfram SystemModeler.

Wolfram SystemModeler

Which is a tool for letting one design and simulate complex devices with tens of thousands of components. Like airplanes or turbines. Well, hooking this up to Wolfram|Alpha, we’ll be able to just ask questions in natural language, and have Wolfram|Alpha go to SystemModeler to automatically simulate a device, and then figure out how to do something.

Wolfram SystemModeler

Here’s a different direction: set Wolfram|Alpha loose on something like a document, where it can use our natural language technology to automatically add computation.

You know, today Wolfram|Alpha operates as an on-demand system: you say something to it, and it’ll respond. But in the future, it’s increasingly going to be used in a preemptive way. It’s going to sense or see something, and it’s automatically going to show you what it thinks you should know. Right now, the main issue that we see in people using Wolfram|Alpha is that they don’t understand all the things it can do. But in this preemptive mode, there’s no issue with that kind of discovery. Wolfram|Alpha is just going to automatically be figuring out what to show people. And once the hardware for augmented reality is there, this is going to be really neat. I mean, within Mathematica we now have what I think is the world’s most powerful image computation system. And combining this with Wolfram|Alpha capabilities, we’re going to be able to do a lot.

Wolfram Mathematica 9

I mentioned Mathematica here. It’s sort of our secret weapon. It’s how we’ve managed to do everything we’ve done. Including build that outrageously complex thing that is Wolfram|Alpha. Many of you I hope have heard of Mathematica. This June it’ll be the 25th anniversary of the original release of Mathematica. And I’m proud of how many inventions and discoveries have now been made in the world using Mathematica over that period of time. As well as how many students have been educated with it.

You know, I originally built Mathematica for a kind of selfish reason: I wanted to have it myself. And my goal was to make it broad enough that it could handle sort of any kind of computation I’d ever want to do. My approach was kind of a typical natural-science one. Think about all those different kinds of computations, drill down and try to understand the primitives that lie beneath them, and then implement those primitives in the system. And in a sense my plan was ultimately just to implement anything systematic and algorithmic that could be implemented.

Now I had a very important principle right from the beginning: as the system grew, it must always remain consistent and unified. Every new capability that was added must coherently fit into the structure of the system. And it was a huge amount of work to maintain that kind of design discipline. But I have to say that particularly in the last 10 years or so, it’s paid off unbelievably. Certainly it’s important in letting people learn what’s now a very big system. But even more important is that it’s allowed us to have a very powerful kind of recursive development process, in which anything we add now can “for free” use those huge blocks of functionality that we’ve already built.

The result is that we’ve been covering huge algorithmic areas incredibly fast, and with much more powerful algorithms than have ever been possible before. Actually, a lot of the time we’re really building not just algorithms, but meta-algorithms. Because another big principle we have is that everything should be as automated as possible.

You as a human want to just tell Mathematica what task you’re trying to perform. And there might be 200 different algorithms that could in principle be used. But it’s up to Mathematica to figure out automatically what the best one is. Internally, Mathematica is using very sophisticated algorithms—many of which we’ve invented. But the great thing is that a user doesn’t have to know anything about the details; that’s all handled automatically.
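(Here’s a generic illustration of that kind of automation, my own toy example rather than one from the talk: NMinimize silently selects among its many optimization methods on its own.)

(* Mathematica chooses an appropriate global-optimization method automatically *)
NMinimize[{Sin[x] + x^2/10, -10 <= x <= 10}, x]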

You know, Mathematica has by far the largest set of interconnected algorithmic capabilities that’s ever existed. And it’s not just algorithms that are built in; it’s also knowledge. Because all the knowledge in Wolfram|Alpha is directly accessible, and progressively more closely integrated, in Mathematica. It’s really quite a transformational thing. I call it knowledge-based computing. Whether you’re using the Wolfram|Alpha API or Mathematica, you’re able to do computing in which you can in effect start from the knowledge of the world, and then build from there.

I have to say that I’ve increasingly realized that Mathematica has been rather undersold. People think of it as that great tool for doing math. Which it certainly is. But it’s so much more than that. It was designed that way from the beginning, and as the years go by “math” becomes a smaller and smaller fraction of what the capabilities of Mathematica are about.

Really there are several parts to Mathematica. The most fundamental is the language that Mathematica embodies. It’s ultimately based on the idea that everything can be represented as a symbolic expression. Whether it’s an array of data, an image, a document, a program, an interface, whatever. This is an idea that I had more than 25 years ago—and over the years I’ve gradually realized just how powerful it is: having a small set of primitives that can seamlessly handle all those different kinds of things, and that provides in a sense an elegant “fusion” of many popular modern programming paradigms.

In addition to the symbolic character of the language, there’s another key point. Essentially every other computer language has just a small set of built-in operations. Yes, it has all sorts of mechanisms for handling in a sense the “infrastructure” of programming. But when it comes to algorithms and so on, there’s very little there. Maybe there are libraries, but they’re not unified, and they’re not really part of the language. Well, the point in our language is that all those algorithms are actually built right into the language. And that’s not all: there’s actual knowledge and data built into the language too.

It’s really a new kind of language. Something very different from other languages. And something incredibly productive for people who use it. But I have to say, in a sense I think it’s been rather hidden all these years. Not that there aren’t millions of people using the language through Mathematica. But there really should be a lot more—including lots who won’t be caught dead doing anything that anyone might think had “math” in it.

Really anyone who’s doing anything algorithmic or computational should be using it. Because it’s inevitably just much more efficient than anything else—because it has so much already built in. So one of the new things that we’re doing is to break out the language that Mathematica is based on, and give it a separate life. We’ve been thinking about this for more than 20 years. But now it’s finally going to happen.

We agonized for a long time about what to call the language. We came up with all kinds of names—clever, whimsical, whatever—and actually just recently on my blog I asked people for their comments and suggestions. And I suppose the result was a little embarrassing. Because after all the effort we put in, by far the most common response about the name we should use was the most obvious and straightforward one. We should call it the Wolfram Language.

So that’s what it’ll be. The language we’ve built for Mathematica, with that huge network of built-in algorithms and knowledge, will be called the Wolfram Language. It’ll use .wolf files, and of course that means its icon has to be something like this:

Wolfram Language logo

What’s going to happen with this language? Well, here’s where things really get interesting. The language was originally built for the desktop platform that’s the current way most people use Mathematica. But in Wolfram|Alpha, for example, the language is running on a large scale in the cloud. And what’s going to be happening over the next few months is that we’ll be releasing a full cloud version. And not only that, there’ll also be a version running locally on mobile, first under iOS.

Why is that important? Well, it really opens up the language, both its use and its deployment. So, for example, we’re going to have the Wolfram Programming Cloud, in which you can freely write code in the language—anything from a pithy one-liner to something giant—right there in the cloud. And then immediately deploy it in all sorts of ways.

If you wanted, you could just run it in an interactive session, like in standard Mathematica. But you can also generate an instant API. That you can call from anywhere, to just seamlessly run code in our cloud. Or you can embed the code in a page, or have the code just run in the background, periodically generating reports or whatever. And then you can take the exact same code, and deploy it on mobile too.
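(As a sketch of that instant-API step, using the CloudDeploy and APIFunction primitives from the eventual Wolfram Cloud release; the squaring function here is just a placeholder.)

(* Deploy a trivial web API to the cloud; the CloudObject URL it returns can be called from anywhere *)
CloudDeploy[APIFunction[{"x" -> "Number"}, #x^2 &]]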

Now something else that we’ve built and refined over the years in Mathematica is our dynamic interface, that uses symbolic expressions to represent controls and interactivity. Not every use of the Wolfram Language uses that interface. But what’s happening is that we’re reinterpreting the interface to optimize it not just for the desktop, but also for the cloud and for mobile.

One place the interface is used big time is in what we call “CDF“: our computable document format. We introduced this a couple of years ago. Underneath it’s Wolfram Language code. On top, it’s a dynamic interactive interface that one can use to make reports and presentations and interactive documents of any kind. Right now, they can be in a plugin in a browser, or they can be standalone on a desktop. What’s happening now is that they can also be on mobile, or, with cloud CDF, they can operate in a pure web page, with no plugin, but just sending every computation to the cloud.

It might sound a bit abstract here. But I think the whole deployment of the Wolfram Language is going to be quite a revolution in programming. There’ve been seeds of this in Mathematica for a quarter of a century. But it’s a kind of convergence of cloud and mobile technology—and frankly our own understanding of the power of what we have—that’s making all this happen now.

You know, the fact that it’s so easy to get so much done in the language is not only important for professional programmers; it’s also really important for kids and anyone else who’s learning to program. Because you don’t have to type much in, and you’re immediately doing serious stuff. And, by the way, you get to learn all those state-of-the-art programming and algorithm concepts right there. And also: there’s an on-ramp that’s easier than anyone’s ever had before, with free-form natural language courtesy of the Wolfram|Alpha engine. It really seems to work very well for this purpose—as we’ve seen in our Mathematica Summer Camp for high-school kids, and our new after-school initiative for middle-school kids.

Maybe I should actually show a demo of all this stuff.

CountryData["SouthAmerica"]

{Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, FalklandIslands, FrenchGuiana, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela}

CountryData[#, "Flag"] & /@ %

Flags of South American countries

EdgeDetect /@ %

Edges of flags of South American countries

There is a whole mechanism for deploying these dynamic things using CDF.
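For instance, a one-line Manipulate (a generic example of mine, not the demo from the talk) produces exactly the kind of dynamic interface that can be deployed through CDF:

(* A slider-driven interface for exploring all 256 elementary cellular automaton rules *)
Manipulate[ArrayPlot[CellularAutomaton[rule, {{1}, 0}, 60]], {{rule, 110}, 0, 255, 1}]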

One application area that’s fun—and topical these days—is using algorithmic processes to make things that one can 3D-print.

3D printing example

That was the Wolfram Language on the desktop, and CDF. Here it is in the Programming Cloud.

Wolfram Programming Cloud

That’s cloud CDF. This also works on iOS, though the controls look a bit different.

In the next little while, you’ll be seeing a variety of platforms based on our technology. The Document Platform, for creating CDF documents, in the cloud or elsewhere. The Presentation Platform, for creating full computable interactive presentations. The Discovery Platform, optimized for the workflow of discovering things with our technologies.

Many of these involve not just the pure language, but also CDF and our dynamic interface technology. But one important thing that’s just happening now is that the Wolfram Language, with all its capabilities, is starting to fit in some very cheap hardware. Like Raspberry Pi. For years if you wanted to embed algorithms into some device, you’d have to carefully compile them into some low-level language or some such. But here’s the great thing: for the first time, this year, embeddable processors are powerful enough that you can just run the whole Wolfram Language, right on them. So you can be doing your image processing, or your control theory computation, right there, with all the power of everything we’ve built in Mathematica.
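(On a Raspberry Pi with a camera attached, that kind of on-device image processing can be as simple as this sketch; CurrentImage assumes a camera is actually connected.)

(* Capture a frame from the device camera and run edge detection on it *)
EdgeDetect[CurrentImage[]]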

By the way, I might say something about devices. The whole landscape of sensors and devices is changing, with everything getting more diverse and more ubiquitous. And one important thing we’re doing is making a general Connections Hub for sensors and devices. In effect we’re curating sensors and devices, and working with lots of manufacturers. So that the data that comes from their systems can seamlessly flow into Wolfram|Alpha, or into anything based on the Wolfram Language. We’re building a generic analytics system that anyone can plug into. It can be used in a fully automatic way, like in Wolfram|Alpha Pro. And it can be arbitrarily customized and programmed, using the Wolfram Language.

By the way, another component of this, primarily for researchers, is that we’re building a general Data Repository. What’s neat here is that because of our Wolfram|Alpha linguistic capabilities, we can automatically read and align data. And then of course we can do analysis. When you read a research paper today, if you’re lucky there’ll be some URL listed where you can find data in some raw form. But with our Data Repository people are going to be able to have genuinely “data-backed papers”. Where anyone can immediately do comparisons or new analysis.

Talking of data, I’ve been a big collector of it personally for a long time. Last year here I showed for the first time some of my nearly 25-year time series of personal analytics data. Here’s the new version.

Plot of every email sent

That’s every email I sent, including this year.

Plot of keystrokes

That’s keystrokes.

Daily rhythms

And that’s my whole average daily rhythm over the past year.

Oh, and here’s something useful I built actually right after South by Southwest last year, that I was embarrassed I didn’t have before: the time series of the number of pending and unanswered emails I have. (It’s computing in real time here in our cloud platform.)

Time series of the number of pending and unanswered emails over the last 30 days

It’s sort of a proxy for busyness level. Which is pretty useful in managing my schedule and so on.

Well, bizarre as it seems to me, I may be the human who’s ended up collecting more long-term data on themselves than anyone else.

But nowadays everyone’s got lots of data on themselves. Like on Facebook, for example. And so in Wolfram|Alpha we recently released Personal Analytics for Facebook. It’ll be coming out in an app soon too. So you can just go to Wolfram|Alpha and ask for a Facebook report, and it’ll actually generate a whole little book about you, combining analysis of your Facebook data with public computational knowledge.

My personal Facebook is a mess, but here’s what the system does on it:

Stephen Wolfram's Facebook report

When we first released our Personal Analytics for Facebook we were absolutely draconian about not keeping any data. And no doubt we destroyed some great sociometric science in the making. But a month or so ago we started keeping some anonymized data, and started a Data Donor program, which has been very successful. So now we can explore quite a few things. Like here are a few friend graphs.

Facebook friend graphs

There’s a huge diversity. Each one tells a story. Both about personality and circumstances.

But let’s look at some aggregate information. Like here’s the distribution of the number of friends that people have.

Distribution of the number of Facebook friends that people have

And this shows the distribution of ages of friends for a person of a particular age.

Distribution of ages of friends for a person of a particular age

The distribution gets broader with age. Actually, after about age 25, there’s some sort of new law of nature one discovers: that at any age about half of people’s friends are between 25% younger and 25% older.

By the way, in Mathematica and the Wolfram Language there’s also now direct access to social media data for Facebook, LinkedIn, Twitter and so on. So you can do all kinds of interesting analysis and visualization.

Actually, talking of Personal Analytics, here’s a new dimension. I’ve been walking around South by Southwest for a couple of days wearing this cute Memoto camera, which takes a picture every 30 seconds. And last night my 14-year-old was kind enough to write a bit of code to analyze what I got. Here’s what he came up with.

Memoto camera data

You know, it’s pretty neat to see how our big technology stack makes all this possible. I mean, even just to read stuff properly from Facebook we’ve got to be able to understand free-form input. Which of course we can with the Wolfram|Alpha Engine. And then to say interesting things we’ve got to use knowledge and algorithms. Then we’ve got to have good automated visualization. And it helps to have state-of-the-art large-graph-manipulation algorithms in the Wolfram Language Engine. And also to have CDF and our Dynamic Interface to generate complete reports.

To me it’s exciting—if a little overwhelming—to see how many things can be moved forward with our technology stack. One big one is education. Of course Wolfram|Alpha and Mathematica are extremely widely used—and well known—in education. And they’re used as central tools in endless courses and so on.

But with our upcoming Cloud Platform lots of new things are going to become possible. And as my way to understand that, I’ve decided it’s time for me to actually make a course or two myself. You know, I was a professor once, before I was a CEO. But it’s been 25 years. Still, I decided the first course to do was one on Data Science. An Introduction to Data Science. I’m having a great time.

Data Science Course example

Data Science is a terrific topic. Really in the modern world everyone should learn it. It’s both immediately useful, and a great way to teach programming, as well as general computational and quantitative thinking.

Between our Cloud Platform and the Wolfram Language, we have a great way to set up the actual course. Here’s the basic setup. Below the video there’s a window where you can just immediately play with all the code that’s shown. And because it’s just very high-level Wolfram Language code it’s realistic to learn in effect just by immersion.

And when it comes to setting up exercises and so on, it’s pretty interesting when you have Wolfram|Alpha-style natural language understanding capabilities. I hope the Data Science course will be ready to test in a limited number of months. And, needless to say, it’s all being built with a very automated authoring system that’ll allow lots of people to make courses like this. I’m thinking about trying to do a math course, for example.

We get asked about math education a lot, of course. And actually we have a non-profit spinoff called Computer-Based Math that’s been trying to create what we see as a modern computer-informed math curriculum. You see, the current math curriculum was mostly set a century ago, when the world was very different. Two things have changed today: first, we’ve got computers that can automate the mechanical doing of math. And second, there are lots of new and different ways that math gets used in the world at large.

Computer-Based Math

It’s going to be a long process modernizing math education, around the world. We’d been wondering what the first country really to commit to Computer-Based Math would be. Turns out it’s Estonia, which signed up a few weeks ago.

So we’re slowly moving toward people being educated in the kind of computational paradigm. Which is good, because the way I see it, computation is going to become central to almost every field. Let’s talk about two examples—classic professions: law and medicine. It’s funny, when Leibniz was first thinking about computation at the end of the 1600s, the thing he wanted to do was to build a machine that would effectively answer legal questions. It was too early then. But now we’re almost ready, I think, for computational law. Where for example contracts become computational. They explicitly become algorithms that decide what’s possible and what’s not.

You know, some pieces of this have already happened. Like with financial derivatives, like options and futures. In the past these used to just be natural language contracts. But then they got codified and parametrized. So they’re really just algorithms, which of course one can do meta-computations on, which is what has launched a thousand hedge funds, and so on.

Well, eventually one’s going to be able to make computational all sorts of legal things, from mortgages to tax codes to perhaps even patents. Now to actually achieve that, one has to have ways to represent many aspects of the real world, in all its messiness. Which is what the whole knowledge-based computing of Wolfram|Alpha is about.

sore throat + cough

How about medicine? To me probably the single most important short-term target in medicine is diagnosis. If you get a diagnosis wrong—and an awful lot are wrong in practice—then all the effort and money you spend is going to be wasted, and is often even going to be harmful. Now diagnosis is a difficult thing for humans. And as more is discovered in medicine—and medicine gets more specialized—it gets even more difficult. But I suspect that in fact diagnosis is in some sense not so hard for computers. But it’s a big project to make a credible automated diagnosis system. Because you have to cover everything: it’s no good just doing one particular kind of disease, because then all you’re going to do is say that everyone has it.

By the way, the whole area of diagnosis is about to change—as a result of the arrival of sensor-based medicine. It used to be that you could ask a question or do a test, and the result would be one bit, or one number. But now it’s routine to be able to get lots and lots of data. And if we’re really going to use that data, we’ve got to use computers; humans just don’t deal with that kind of thing. It’s an ambitious project with many pieces, but I think that using our technology stack—and some ideas from science I’ve developed—we know how to do automated medical diagnosis. And we’re actually spinning off a company to do this.

You know, it’s interesting to think about the broad theory of diagnosis. And I think an interesting model for medical diagnosis is software diagnosis—figuring out what’s going wrong with a large running software system. In medicine we have all these standard diagnosis codes. For an operating system one might imagine having things like “diseases of the memory management system” or “diseases of the keyboard driver”. In medicine, we’re starting to be able to measure more and more. But in software we can in principle monitor almost everything. But we need methodologies to interpret what we’re seeing.

By the way, even though I think diagnosis is in the short term a critical point in medicine, I think in the long term it’s simply going to go away. In fact, from my science—as well as the software analogy—I think it’s clear that the idea of discrete diseases is just wrong. Of course, today we have just a few thousand drugs and surgeries we can use. But I think more and more we’ll be using algorithmic treatments. Whether it’s medical devices that behave according to algorithms, or whether it’s even programmable drugs that effectively do a computation at the molecular scale to work out how to act. And once the treatments are algorithmic, we’re really going to want to go directly from data on symptoms to working out the treatment, often adaptively in real time.

My guess is it’s going to end up a bit like a financial portfolio. You watch what the stocks do, and you have algorithms to decide how to respond. And you don’t really need to have a verbal description—like the technical trader’s “head and shoulders” pattern or something—of what the stock chart is doing.

By the way, when you start thinking about medicine in fundamentally computational terms, it gives you a different view of human mortality. It’s like an operating system that’s running, and over the course of time suffers various kinds of trauma and infections, starts running slower, and eventually crashes and dies. If we’re going to avoid mortality, we need to understand how to intervene to keep the operating system—or the human—up and running. There are lots of interim steps. Taking over more and more biological functions with technology. And figuring out how to reprogram pieces of the molecular machine that is our body. And figuring out if necessary how to “hit the pause button” to freeze things, presumably with cryonics.

By the way, it’s bizarre how few people work on this. Because I’m sure that, just like cloning, there’s just going to be a wacky procedure that makes it possible—and once we know it, we’re just going to be able to do it quite routinely, and it’s going to be societally very important. But in the end, we want to solve the problem of keeping all the complexity that is a human running indefinitely. There are some fascinating basic science problems here. Connected to concepts like computational irreducibility, and a bit to the traditional halting problem. But I have no doubt that eventually it’ll be solved, and we’ll achieve effective human immortality. And when that happens I expect it’ll be the single biggest discontinuity in human history.

Cellular automata

You know, as one thinks about such things, one can’t help wondering about the general future of the human condition. And here’s something someone like me definitely thinks about. I’m spending my life trying to automate things. Trying to make it possible to do automatically with computation things that humans used to have to do themselves.

Now, if we look at the arc of human history, the biggest systematic change through time is the arrival of more and more technology, and the automation of more and more kinds of tasks. So here’s a question: what if we succeed in automating everything? What will happen then? What will the humans do? There’s an ultimate—almost philosophical—version of this question. And there’s also a practical next-few-decades version.

Let’s start with the ultimate version. As we go on and build more and more technology, what will the end point be? We might assume that we could somehow go on forever, achieving more and more. But the Principle of Computational Equivalence tells us that we cannot. Once we have reached a certain level, everything is already in a sense possible. And even though our current engineering has not yet reached this point, the Principle of Computational Equivalence also tells us that this maximal level of computational sophistication is not particularly rare. Indeed it happens in many places in the physical world, as well as in systems like simple cellular automata.

Cellular automata

And it’s not too hard to see that as we improve our technology, getting down to the smallest scales, and removing everything that seems redundant, we might wind up with something that looks just like a physical process that already happens in nature. So does this mean that in the ultimate future, with all that great automation and technology, all we’ll achieve is just to produce something that’s indistinguishable from zillions of things that already exist in nature?

In some sense, yes. It’s a sort of ultimate Copernicanism: not only is our Earth not the center of the universe, and our bodies not made of something physically unique. But also, what we can achieve and create with our intelligence is not in a fundamental sense different from what nature is already doing.

So is there any meaningful ultimate future for us? The answer is yes. But it’s not about doing some kind of scientific utopian thing, and achieving some ultimate perfect state that’s independent of our history. Rather, it’s about doing things that depend on all those messy details of us humans and our history.

Here’s a way to understand this. Imagine our technology has got us a complete AI sitting in a box on a desk. It can do all sorts of incredible things; all sorts of sophisticated computations. The question is: what will it choose to do? It has no intrinsic way to decide. It needs some kind of goal, some kind of purpose, imposed on it. And that’s where we humans and our history come in. I mean, for humans, there is again no absolute purpose abstractly defined. We get our notion of purpose from the details of our existence and our history. And achieving ultimate technology is in a sense empty unless purposes are defined for it.

We can begin to see this pretty well even right now. In the past, our technology was such that we typically had to define quite explicitly what systems we build should do, say by writing code that defines each step they should take. But today we’ve increasingly got much more capable systems, that can do all kinds of different things. And we interact with them in a sense by injecting purpose. We define a purpose or a goal, and then the system figures out how it can best achieve that goal.

Well, of course, human purposes have evolved quite a bit over the course of human history. And often their evolution is connected to the arrival of technology that makes more things possible. So it’s not too clear what the limit of this kind of co-evolving system will be, and whether it will turn out to be wonderful or terrible. But in the nearer term, we can ask what effect increasing automation will have on people and society. And actually, as I was thinking about this recently, I thought I’d pull together some data about what’s happened with this historically. So here are some plots over the past 150 years of what fractions of people in the US have been in different kinds of occupations. Blue for males; pink for females.

Fractions of people in the US who have been in different kinds of occupations over the last 150 years

There are lots of interesting details here, like the pretty obvious direct and indirect effects of larger government over the last 50 years. But there’s also a clear signature of automation, with a variety of kinds of occupations simply going away. And this will continue. And indeed my expectation is that over the coming years a remarkable fraction of today’s occupations will successfully be automated. In the past, there’ve always been new occupations that took the place of ones that were automated away. And my guess, or perhaps hope, is that for most people some hybrid of avocation and occupation will emerge.

Which brings me to something I’ve been thinking about quite a lot recently. I’m mostly a science, technology and ideas guy. But I happen also to be very interested in people. And over the years I’ve had the good fortune to work with—and mentor—a great many very talented people. But here’s something I’ve noticed. Many people—and young people in particular—have an incredibly difficult time picking a good occupation—or avocation—for themselves. It’s a bit of a puzzle. People have certain sets of talents and interests. And there are certain niches that exist in the world at any given time. The problem is to match a given person with a niche.

Now sometimes people—and I was an example—pick out a pretty clear niche by the time they’re early teenagers. But an awful lot of people don’t. Usually there are two problems. First, people don’t really identify their skills and interests. And second, people don’t know what’s possible to do in the world. And in the end, an awful lot of people pick directions—almost at random—that aren’t in fact very good for them. And I suspect in terms of wasted resources in the world, this is pretty high up there.

You know, I have a kind of optimistic theory—that’s supported by a lot of personal observation—that for almost every person, there’s at least one really good thing they could be doing, that they will find really fulfilling. They may be lucky or unlucky about what value the world places on that thing at a given time in history. But if they can find that thing—and it often isn’t so easy—then it’s great.

Well, needless to say, I’ve been thinking about what can be done. I’ve personally worked on the problem many times. With many great results. Although I have to say that almost always I’ve been dealing with highly capable individuals in good circumstances. And I do want to figure out how to generalize, to younger folk and less good circumstances. But whatever happens, there’s a puzzle to solve. A little like medical diagnosis. Requiring understanding the current situation. Then knowing what’s possible. And one of the practical challenges is knowing enough about how the world is evolving, and what new occupations and ways to operate in the world are emerging.

I’m hoping to do more in this direction. I’m also thinking a bunch about the structure of education. If people have an idea what they might like to do, how do they develop in that direction? The current system with college and so on is pretty inflexible. But I think there are better alternatives, that involve effectively doing diverse mentored projects. Which is something we’ve seen very successfully in the summer schools we’ve done over the past decade.

But anyway, with all this discussion about what people should do: that’s a big challenge for someone like me too. Because I’m in this situation where I’ve been building things for 30 years, and now there are just an absurd number of things that what I’ve built makes possible. We’re pursuing a lot of things at our company. But we only have 700 people, which isn’t enough for everything we want to do. I made a decision long ago to have a simple private company, so we could concentrate on the long term, and on what we really wanted to do. And I’m happy to say that for the last quarter century that’s worked out very well. And it’s made possible things like Wolfram|Alpha—that probably nobody but me would ever have been crazy enough to put money into.

But now we’ve just got too many opportunities, and I’ve decided we’re just leaving too many great ideas—and great technology prototypes—on the table. So we’ve been learning how to spin off companies to develop these things. And actually, we have a whole scheme now for setting up an outside fund to invest in spinoffs that we’re doing.

I’ve been used to architecting technical systems. But architecting these kinds of business structures is also pretty interesting. Sort of trying to extend the machine I’ve built for turning ideas into reality. You know, I like to operate by having a whole portfolio of long-range ideas. Which I carry around with me for a long time. Like for Wolfram|Alpha it was more than 30 years. Gradually waiting for the circumstances and the right time to pursue them. And as I said earlier, I would probably be doing my physics project now, if technology opportunities hadn’t got in the way.

Though I have to say that the architecture of that project is tricky too. Because it’s not clear how to fit it into the world. I mean, lots of people, including myself, are incredibly curious about it. But for the physics community it’s a scary, paradigm-breaking, proposition. And it’s going to be an uphill story there.

And the issue for someone like me is: how much does the world really want something like the fundamental theory of physics done? It’s always great feedback for me doing projects where people really like the results. I don’t know about this one. I’ve been thinking about trying to find out by putting up a Kickstarter project or something for finding the fundamental theory of physics. It’s kind of funny how one goes from that level of practicality, to thinking about the structure of our whole universe. It’s fun—and, to me, invigorating.

Well, there are lots more things it’d be fun to talk about. But let me stop here, and hope that you’ve enjoyed hearing a little about what’s going on these days in my small corner of the world.

Data Science of the Facebook World


More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes.

A few weeks ago we decided to start analyzing all this data. And I have to say that if nothing else it’s been a terrific example of the power of Mathematica and the Wolfram Language for doing data science. (It’ll also be good fodder for the Data Science course I’m starting to create.)

We’d always planned to use the data we collect to enhance our Personal Analytics system. But I couldn’t resist also trying to do some basic science with it.

I’ve always been interested in people and the trajectories of their lives. But I’ve never been able to combine that with my interest in science. Until now. And it’s been quite a thrill over the past few weeks to see the results we’ve been able to get. Sometimes confirming impressions I’ve had; sometimes showing things I never would have guessed. And all along reminding me of phenomena I’ve studied scientifically in A New Kind of Science.

So what does the data look like? Here are the social networks of a few Data Donors—with clusters of friends given different colors. (Anyone can find their own network using Wolfram|Alpha—or the SocialMediaData function in Mathematica.)

social networks
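(Retrieving and laying out one’s own network is essentially a two-liner in Mathematica; this sketch assumes you’ve authorized the Facebook connection when prompted.)

g = SocialMediaData["Facebook", "FriendNetwork"];
CommunityGraphPlot[g] (* lay the network out by clusters of friends *)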

So a first quantitative question to ask is: How big are these networks usually? In other words, how many friends do people typically have on Facebook? Well, at least for our users, that’s easy to answer. The median is 342—and here’s a histogram showing the distribution (there’s a cutoff at 5000 because that’s the maximum number of friends for a personal Facebook page):

distribution of number of friends for our users

But how typical are our users? In most respects—so far as we can tell—they seem pretty typical. But there are definitely some differences. Like here’s the distribution of the number of friends not just for our users, but also for their friends (there’s a mathematical subtlety in deriving this that I’ll discuss later):

distribution of number of friends for users+friends

And what we see is that in this broader Facebook population, there are significantly more people who have almost no Facebook friends. Whether such people should be included in samples one takes is a matter of debate. But so long as one looks at appropriate comparisons, aggregates, and so on, they don’t seem to have a huge effect. (The spike at 200 friends probably has to do with Facebook’s friend recommendation system.)

So, OK. Let’s ask for example how the typical number of Facebook friends varies with a person’s age. Of course all we know are self-reported “Facebook ages”. But let’s plot how the number of friends varies with that age. The solid line is the median number of friends; successive bands show successive octiles of the distribution.

number of friends vs. age

After a rapid rise, the number of friends peaks for people in their late teenage years, and then declines thereafter. Why is this? I suspect it’s partly a reflection of people’s intrinsic behavior, and partly a reflection of the fact that Facebook hasn’t yet been around very long. Assuming people don’t drop friends much once they’ve added them, one might expect that the number of friends would simply grow with age. And for sufficiently young people that’s basically what we see. But there’s a limit to the growth, because there’s a limit to the number of years people have been on Facebook. And assuming that’s roughly constant across ages, what the plot suggests is that people add friends progressively more slowly with age.

But what friends do they add? Given a person of a particular age, we can for example ask what the distribution of ages of the person’s friends is. Here are some results (the jaggedness, particularly at age 70, comes from the limited data we have):

friend ages for people of different ages

And here’s an interactive version, generated from CDF:

Interactive CDF version

The first thing we see is that the ages of friends always peak at or near the age of the person themselves—which is presumably a reflection of the fact that in today’s society many friends are made in age-based classes in school or college. For younger people, the peak around the person’s age tends to be pretty sharp. For older people, the distribution gets progressively broader.

We can summarize what happens by plotting the distribution of friend ages against the age of a person (the solid line is the median age of friends):

median age of friends vs. age

There’s an anomaly for the youngest ages, presumably because of kids under 13 misreporting their ages. But apart from that, we see that young people tend to have friends who are remarkably close in age to themselves. The broadening as people get older is probably associated with people making non-age-related friends in their workplaces and communities. And as the array of plots above suggests, by people’s mid-40s, there start to be secondary peaks at younger ages, presumably as people’s children become teenagers, and start using Facebook.

So what else can one see about the trajectory of people’s lives? Here’s the breakdown according to reported relationship status as a function of age:

relationship status fractions vs. age

And here’s more detail, separating out fractions for males and females (“married+” means “civil union”, “separated”, “widowed”, etc. as well as “married”):

relationship status fractions vs. age

There’s some obvious goofiness at low ages with kids (slightly more often girls than boys) misreporting themselves as married. But in general the trend is clear. The rate of getting married starts going up in the early 20s—a couple of years earlier for women than for men—and decreases again in the late 30s, with about 70% of people by then being married. The fraction of people “in a relationship” peaks around age 24, and there’s a small “engaged” peak around 27. The fraction of people who report themselves as married continues to increase roughly linearly with age, gaining about 5% between age 40 and age 60—while the fraction who report themselves as single continues to increase for women, but decreases for men.

I have to say that as I look at the plots above, I’m struck by their similarity to plots for physical processes like chemical reactions. It’s as if all those humans, with all the complexities of their lives, still behave in aggregate a bit like molecules—with certain “reaction rates” to enter into relationships, marry, etc.

Of course, what we’re seeing here is just for the “Facebook world”. So how does it compare to the world at large? Well, at least some of what we can measure in the Facebook world is also measured in official censuses. And so for example we can see how our results for the fraction of people married at a given age compare with results from the official US Census:

fraction married vs. age

I’m amazed at how close the correspondence is. Though there are clearly some differences. Like below age 20 kids on Facebook are misreporting themselves as married. And on the older end, widows are still considering themselves married for purposes of Facebook. For people in their 20s, there’s also a small systematic difference—with people on Facebook on average getting married a couple of years later than the Census would suggest. (As one might expect, if one excludes the rural US population, the difference gets significantly smaller.)

Talking of the Census, we can ask in general how our Facebook population compares to the US population. And for example, we find, not surprisingly, that our Facebook population is heavily weighted toward younger people:

population vs. age

OK. So we saw above how the typical number of friends a person has depends on age. What about gender? Perhaps surprisingly, if we look at all males and all females, there isn’t a perceptible difference in the distributions of number of friends. But if we instead look at males and females as a function of age, there is a definite difference:

number of friends vs. age

Teenage boys tend to have more friends than teenage girls, perhaps because they are less selective in who they accept as friends. But after the early 20s, the difference between genders rapidly dwindles.

What effect does relationship status have? Here’s the male and female data as a function of age:

median number of friends vs. age

In the older set, relationship status doesn’t seem to make much difference. But for young people it does. With teenagers who (mis)report themselves as “married” on average having more friends than those who don’t. And with early teenage girls who say they’re “engaged” (perhaps to be able to tag a BFF) typically having more friends than those who say they’re single, or just “in a relationship”.

Another thing that’s fairly reliably reported by Facebook users is location. And it’s common to see quite a lot of variation by location. Like here are comparisons of the median number of friends for countries around the world (ones without enough data are left gray), and for states in the US:

median number of friends by location

There are some curious effects. Countries like Russia and China have low median friend counts because Facebook isn’t widely used for connections between people inside those countries. And perhaps there are lower friend counts in the western US because of lower population densities. But quite why there are higher friend counts for our Facebook population in places like Iceland, Brazil and the Philippines—or Mississippi—I don’t know. (There is of course some “noise” from people misreporting their locations. But with the size of the sample we have, I don’t think this is a big effect.)

In Facebook, people can list both a “hometown” and a “current city”. Here’s how the probability that these are in the same US state varies with age:

percentage who moved states vs. age

What we see is pretty much what one would expect. For some fraction of the population, there’s a certain rate of random moving, visible here for young ages. Around age 18, there’s a jump as people move away from their “hometowns” to go to college and so on. Later, some fraction move back, and progressively consider wherever they live to be their “hometown”.

One can ask where people move to and from. Here’s a plot showing the number of people in our Facebook population moving between different US states, and different countries:

migration between US states

migration between countries

There’s a huge range of demographic questions we could ask. But let’s come back to social networks. It’s a common observation that people tend to be friends with people who are like them. So to test this we might for example ask whether people with more friends tend to have friends who have more friends. Here’s a plot of the median number of friends that our users’ friends have, as a function of the number of friends that the users themselves have:

median friend count vs. friend count

And the result is that, yes, on average people with more friends tend to have friends with more friends. Though we also notice that people with lots of friends tend to have friends with fewer friends than themselves.

And seeing this gives me an opportunity to discuss a subtlety I alluded to earlier. The very first plot in this post shows the distribution of the number of friends that our users have. But what about the number of friends that their friends have? If we just average over all the friends of all our users, here’s how the resulting distribution compares to the original one for our users themselves:

distribution of number of friends

It seems like our users’ friends always tend to have more friends than our users themselves. But actually from the previous plot we know this isn’t true. So what’s going on? It’s a slightly subtle but general social-network phenomenon known as the “friendship paradox”. The issue is that when we sample the friends of our users, we’re inevitably sampling the space of all Facebook users in a very non-uniform way. In particular, if our users represent a uniform sample, any given friend will be sampled at a rate proportional to how many friends they have—with the result that people with more friends are sampled more often, so the average friend count goes up.

It’s perfectly possible to correct for this effect by weighting friends in inverse proportion to the number of friends they have—and that’s what we did earlier in this post. And by doing this we determine that in fact the friends of our users do not typically have more friends than our users themselves; instead their median number of friends is actually 229 instead of 342.
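Here’s a toy demonstration of both the paradox and the correction, run on a random preferential-attachment graph standing in for the real network (my own sketch, not our actual analysis code):

g = RandomGraph[BarabasiAlbertGraphDistribution[1000, 3]];
Mean[N[VertexDegree[g]]] (* mean friend count of a random user *)
(* a random friend is sampled in proportion to degree: list every edge endpoint *)
friendDegrees = N[VertexDegree[g, #]] & /@ Flatten[List @@@ EdgeList[g]];
Mean[friendDegrees] (* mean friend count of a random friend; noticeably higher *)
Length[friendDegrees]/Total[1/friendDegrees] (* 1/degree weighting undoes the bias *)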

It’s worth mentioning that if we look at the distribution of number of friends that we deduce for the Facebook population, it’s a pretty good fit to a power law, with exponent -2.8. And this is a common form for networks of many kinds—which can be understood as the result of an effect known as “preferential attachment”, in which as the network grows, nodes that already have many connections preferentially get more connections, leading to a limiting “scale-free network” with power-law features.
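(As a sketch of how such an exponent can be estimated; degreeCounts here is hypothetical {friend count, frequency} data, not our actual dataset.)

(* on log-log axes a power law is a straight line, and the slope b estimates the exponent *)
logData = Log[N[degreeCounts]]; (* Log threads over the {count, frequency} pairs *)
FindFit[logData, a + b x, {a, b}, x]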

But, OK. Let’s look in more detail at the social network of an individual user. I’m not sufficiently diligent on Facebook for my own network to be interesting. But my 15-year-old daughter Catherine was kind enough to let me show her network:

social network

There’s a dot for each of Catherine’s Facebook friends, with connections between them showing who’s friends with whom. (There’s no dot for Catherine herself, because she’d just be connected to every other dot.) The network is laid out to show clusters or “communities” of friends (using the Wolfram Language function FindGraphCommunities). And it’s amazing the extent to which the network “tells a story”. With each cluster corresponding to some piece of Catherine’s life or history.

Here’s a whole collection of networks from our Data Donors:

social networks

No doubt each of these networks tells a different story. But we can still generate overall statistics. Like, for example, here is a plot of how the number of clusters of friends varies with age (there’d be less noise if we had more data):

mean number of clusters vs. age

Even at age 13, people typically seem to have about 3 clusters (perhaps school, family and neighborhood). As they get older, go to different schools, take jobs, and so on, they accumulate another cluster or so. Right now the number saturates above about age 30, probably in large part just because of the limited time Facebook has been around.

How big are typical clusters? The largest one is usually around 100 friends; the plot below shows the variation of this size with age:

median size of largest cluster vs. age

And here’s how the size of the largest cluster as a fraction of the whole network varies with age:

relative size of largest cluster vs. age

What about more detailed properties of networks? Is there a kind of “periodic table” of network structures? Or a classification scheme like the one I made long ago for cellular automata?

The first step is to find some kind of iconic summary of each network, which we can do for example by looking at the overall connectivity of clusters, ignoring their substructure. And so, for example, for Catherine (who happened to suggest this idea), this reduces her network to the following “cluster diagram”:

cluster diagram of social network
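(One way to compute such a diagram, as a sketch: contract each community that FindGraphCommunities finds down to a single vertex, then drop duplicate edges and self-loops.)

communities = FindGraphCommunities[g]; (* g: a friend network like the ones above *)
clusterDiagram = SimpleGraph[Fold[VertexContract, g, communities]]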

Doing the same thing for the Data Donor networks shown above, here’s what we get:

mini social networks

In making these diagrams, we’re keeping every cluster with at least 2 friends. But to get a better overall view, we can just drop any cluster with, say, less than 10% of all friends—in which case for example Catherine’s cluster diagram becomes just:

cluster diagram after clusters with less than 10% of friends were dropped

And now for example we can count the relative numbers of different types of structures that appear in all the Data Donor networks:

Bar chart of different types of clustered social networks

And we can look at how the fractions of each of these structures vary with age:

community graph makeup vs. age

What do we learn? The most common structures consist of either two or three major clusters, all of them connected. But there are also structures in which major clusters are completely disconnected—presumably reflecting facets of a person’s life that for reasons of geography or content are also completely disconnected.

For everyone there’ll be a different detailed story behind the structure of their cluster diagram. And one might think this would mean that there could never be a general theory of such things. At some level it’s a bit like trying to find a general theory of human history, or a general theory of the progression of biological evolution. But what’s interesting now about the Facebook world is that it gives us so much more data from which to form theories.

And we don’t just have to look at things like cluster diagrams, or even friend networks: we can dig almost arbitrarily deep. For example, we can analyze the aggregated text of posts people make on their Facebook walls, say classifying them by topics they talk about (this uses a natural-language classifier written in the Wolfram Language and trained using some large corpora):

topics discussed on Facebook
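
The actual classifier and its training corpora aren’t something I can reproduce here, but the general pattern is what the Wolfram Language function Classify provides. Here’s a toy sketch with made-up posts; a real classifier needs vastly more training data:

    (* hypothetical labeled posts standing in for the large training corpora *)
    training = {
       "great game last night, what a finish" -> "sports",
       "just upgraded my laptop and my phone" -> "technology",
       "happy birthday to my wonderful sister" -> "special occasions",
       "so much rain again this week" -> "weather"};
    topicClassifier = Classify[training];
    topicClassifier["did anyone watch the match yesterday?"]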

Each of these topics is characterized by certain words that appear with high frequency:

word clouds for topics discussed on Facebook

And for each topic we can analyze how its popularity varies with (Facebook) age:

topics discussed on Facebook

It’s almost shocking how much this tells us about the evolution of people’s typical interests. People talk less about video games as they get older, and more about politics and the weather. Men typically talk more about sports and technology than women—and, somewhat surprisingly to me, they also talk more about movies, television and music. Women talk more about pets+animals, family+friends, relationships—and, at least after they reach child-bearing years, health. The peak time for anyone to talk about school+university is (not surprisingly) around age 20. People get less interested in talking about “special occasions” (mostly birthdays) through their teens, but gradually gain interest later. And people get progressively more interested in talking about career+money in their 20s. And so on. And so on.

Some of this is rather depressingly stereotypical. And most of it isn’t terribly surprising to anyone who’s known a reasonable diversity of people of different ages. But what to me is remarkable is how we can see everything laid out in such quantitative detail in the pictures above—kind of a signature of people’s thinking as they go through life.

Of course, the pictures above are all based on aggregate data, carefully anonymized. But if we start looking at individuals, we’ll see all sorts of other interesting things. And for example personally I’m very curious to analyze my own archive of nearly 25 years of email—and then perhaps predict things about myself by comparing to what happens in the general population.

Over the decades I’ve been steadily accumulating countless anecdotal “case studies” about the trajectories of people’s lives—from which I’ve certainly noticed lots of general patterns. But what’s amazed me about what we’ve done over the past few weeks is how much systematic information it’s been possible to get all at once. Quite what it all means, and what kind of general theories we can construct from it, I don’t yet know.

But it feels like we’re starting to be able to train a serious “computational telescope” on the “social universe”. And it’s letting us discover all sorts of phenomena that have the potential to help us understand much more about society and about ourselves, and that, by the way, provide great examples of what can be achieved with data science, and with the technology I’ve been working on developing for so long.

Dropping In on Gottfried Leibniz

I’ve been curious about Gottfried Leibniz for years, not least because he seems to have wanted to build something like Mathematica and Wolfram|Alpha, and perhaps A New Kind of Science as well—though three centuries too early. So when I took a trip recently to Germany, I was excited to be able to visit his archive in Hanover.

Leafing through his yellowed (but still robust enough for me to touch) pages of notes, I felt a certain connection—as I tried to imagine what he was thinking when he wrote them, and tried to relate what I saw in them to what we now know after three more centuries:

Page of Gottfried Leibniz's notes

Some things, especially in mathematics, are quite timeless. Like here’s Leibniz writing down an infinite series for √2 (the text is in Latin):

Example of Leibniz writing down an infinite series for Sqrt[2]
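
I won’t try to transcribe exactly which series Leibniz wrote on this page. But as an example of the kind of timeless statement involved, here’s one convergent series for √2 (the binomial series for (1 + 1)^(1/2)), checked in the Wolfram Language:

    Sum[Binomial[1/2, n], {n, 0, Infinity}]    (* -> Sqrt[2] *)
    N[Sum[Binomial[1/2, n], {n, 0, 200}]]      (* ~ 1.414; it converges rather slowly *)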

Or here’s Leibniz trying to calculate a continued fraction—though he got the arithmetic wrong, even though he wrote it all out (the Π was his earlier version of an equals sign):

Leibniz calculating a continued fraction

Or here’s a little summary of calculus that could almost be in a modern textbook:

Summary of calculus from Leibniz

But what was everything else about? What was the larger story of his work and thinking?

I have always found Leibniz a somewhat confusing figure. He did many seemingly disparate and unrelated things—in philosophy, mathematics, theology, law, physics, history, and more. And he described what he was doing in what seem to us now as strange 17th century terms.

But as I’ve learned more, and gotten a better feeling for Leibniz as a person, I’ve realized that underneath much of what he did was a core intellectual direction that is curiously close to the modern computational one that I, for example, have followed.

Gottfried Leibniz was born in Leipzig in what’s now Germany in 1646 (four years after Galileo died, and four years after Newton was born). His father was a professor of philosophy; his mother’s family was in the book trade. Leibniz’s father died when Leibniz was 6—and after a 2-year deliberation on its suitability for one so young, Leibniz was allowed into his father’s library, and began to read his way through its diverse collection of books. He went to the local university at age 15, studying philosophy and law—and graduated in both of them at age 20.

Even as a teenager, Leibniz seems to have been interested in systematization and formalization of knowledge. There had been vague ideas for a long time—for example in the semi-mystical Ars Magna of Ramon Llull from the 1300s—that one might be able to set up some kind of universal system in which all knowledge could be derived from combinations of signs drawn from a suitable (as Descartes called it) “alphabet of human thought”. And for his philosophy graduation thesis, Leibniz tried to pursue this idea. He used some basic combinatorial mathematics to count possibilities. He talked about decomposing ideas into simple components on which a “logic of invention” could operate. And, for good measure, he put in an argument that purported to prove the existence of God.

As Leibniz himself said in later years, this thesis—written at age 20—was in many ways naive. But I think it began to define Leibniz’s lifelong way of thinking about all sorts of things. And so, for example, Leibniz’s law graduation thesis about “perplexing legal cases” was all about how such cases could potentially be resolved by reducing them to logic and combinatorics.

Leibniz was on a track to become a professor, but instead he decided to embark on a life working as an advisor for various courts and political rulers. Some of what he did for them was scholarship, tracking down abstruse—but politically important—genealogy and history. Some of it was organization and systematization—of legal codes, libraries and so on. Some of it was practical engineering—like trying to work out better ways to keep water out of silver mines. And some of it—particularly in earlier years—was “on the ground” intellectual support for political maneuvering.

One such activity in 1672 took Leibniz to Paris for four years—during which time he interacted with many leading intellectual lights. Before then, Leibniz’s knowledge of mathematics had been fairly basic. But in Paris he had the opportunity to learn all the latest ideas and methods. And for example he sought out Christiaan Huygens, who agreed to teach Leibniz mathematics—after Leibniz passed the test of finding the sum of the reciprocals of the triangular numbers.
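
That test, by the way, has a famously neat answer. Each reciprocal triangular number telescopes: 1/T(n) = 2/(n(n+1)) = 2/n - 2/(n+1), so all but the first term cancel and the sum is exactly 2. In modern Wolfram Language terms:

    Sum[1/(n (n + 1)/2), {n, 1, Infinity}]   (* -> 2 *)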

Over the years, Leibniz refined his ideas about the systematization and formalization of knowledge, imagining a whole architecture for how knowledge would—in modern terms—be made computational. He saw the first step as being the development of an ars characteristica—a methodology for assigning signs or symbolic representations to things, and in effect creating a uniform “alphabet of thought”. And he then imagined—in remarkable resonance with what we now know about computation—that from this uniform representation it would be possible to find “truths of reason in any field… through a calculus, as in arithmetic or algebra”.

He talked about his ideas under a variety of rather ambitious names like scientia generalis (“general method of knowledge”), lingua philosophica (“philosophical language”), mathematique universelle (“universal mathematics”), characteristica universalis (“universal system”) and calculus ratiocinator (“calculus of thought”). He imagined applications ultimately in all areas—science, law, medicine, engineering, theology and more. But the one area in which he had clear success quite quickly was mathematics.

To me it’s remarkable how rarely in the history of mathematics notation has been viewed as a central issue. It happened at the beginning of modern mathematical logic in the late 1800s with the work of people like Gottlob Frege and Giuseppe Peano. And in recent times it’s happened with me in my efforts to create Mathematica and the Wolfram Language. But it also happened three centuries ago with Leibniz. And I suspect that Leibniz’s successes in mathematics were in no small part due to the effort he put into notation, and the clarity of reasoning about mathematical structures and processes that it brought.

When one looks at Leibniz’s papers, it’s interesting to see his notation and its development. Many things look quite modern. Though there are charming dashes of the 17th century, like the occasional use of alchemical or planetary symbols for algebraic variables:

Example of Leibniz's use of alchemical or planetary symbols for algebraic variables

There’s Π as an equals sign instead of =, with the slightly hacky idea of having it be like a balance, with a longer leg on one side or the other indicating less than (“<”) or greater than (“>”):

Example of Leibniz using Pi as an equal sign instead of =

There are overbars to indicate grouping of terms—arguably a better idea than parentheses, though harder to type, and typeset:

Leibniz used overbars to indicate grouping of terms

We do use overbars for roots today. But Leibniz wanted to use them in integrals too, along with the rather nice “tailed d”, which reminds me of the double-struck “differential d” that we invented for representing integrals in Mathematica.

Showing Leibniz's use of overbars in integrals

Particularly in solving equations, it’s quite common to want to use ±, and it’s always confusing how the grouping is supposed to work, say in a±b±c. Well, Leibniz seems to have found it confusing too, but he invented a notation to handle it—which we actually should consider using today too:

Leibniz example of a +- notation

I’m not sure what some of Leibniz’s notation means, though those overtildes are rather nice-looking:

Example of Leibniz's notation with overtildes

As are these things with dots:

One example of Leibniz's notation using dots

Or this interesting-looking diagrammatic form:

Diagrammatic form made by Leibniz

Of course, Leibniz’s most famous notations are his integral sign (long “s” for “summa”) and d, here summarized in the margin for the first time, on November 11th, 1675 (the “5” in “1675” was changed to a “3” after the fact, perhaps by Leibniz):

Leibniz’s most famous notations summarized in the margin for the first time

I find it interesting that despite all his notation for “calculational” operations, Leibniz apparently did not invent similar notation for logical operations. “Or” was just the Latin word vel, “and” was et, and so on. And when he came up with the idea of quantifiers (modern ∀ and ∃), he just represented them by the Latin abbreviations U.A. and P.A.:

Leibniz's notation for logical operations

It’s always struck me as a remarkable anomaly in the history of thought that it took until the 1930s for the idea of universal computation to emerge. And I’ve often wondered if lurking in the writings of Leibniz there might be an early version of universal computation—maybe even a diagram that we could now interpret as a system like a Turing machine. But with more exposure to Leibniz, it’s become clearer to me why that’s probably not the case.

One big piece, I suspect, is that he didn’t take discrete systems quite seriously enough. He referred to results in combinatorics as “self-evident”, presumably because he considered them directly verifiable by methods like arithmetic. And it was only “geometrical”, or continuous, mathematics that he felt needed to have a calculus developed for it. In describing things like properties of curves, Leibniz came up with something like continuous functions. But he never seems to have applied the idea of functions to discrete mathematics—which might for example have led him to think about universal elements for building up functions.

Leibniz recognized the success of his infinitesimal calculus, and was keen to come up with similar “calculi” for other things. And in another “near miss” with universal computation, Leibniz had the idea of encoding logical properties using numbers. He thought about associating every possible attribute of a thing with a different prime number, then characterizing the thing by the product of the primes for its attributes—and then representing logical inference by arithmetic operations. But he only considered static attributes—and never got to an idea like Gödel numbering where operations are also encoded in numbers.
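
One can get a feeling for the scheme with a few lines of Wolfram Language (the attributes and prime assignments here are of course hypothetical illustrations, not Leibniz’s own):

    (* assign each primitive attribute a prime *)
    attributes = <|"rational" -> 2, "animal" -> 3, "mortal" -> 5|>;
    (* a concept is the product of the primes for its attributes *)
    human = attributes["rational"]*attributes["animal"]*attributes["mortal"]   (* -> 30 *)
    (* "every human is an animal" then becomes a divisibility test *)
    Divisible[human, attributes["animal"]]   (* -> True *)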

But even though Leibniz did not get to the idea of universal computation, he did understand the notion that computation is in a sense mechanical. And indeed quite early in life he seems to have resolved to build an actual mechanical calculator for doing arithmetic. Perhaps in part it was because he wanted to use it himself (always a good reason to build a piece of technology!). For despite his prowess at algebra and the like, his papers are charmingly full of basic (and sometimes incorrect) school-level arithmetic calculations written out in the margin—and now preserved for posterity:

Example of basic school-level arithmetic calculations written out in the margin by Leibniz

There were scattered examples of mechanical calculators being built in Leibniz’s time, and when he was in Paris, Leibniz no doubt saw the addition calculator that had been built by Blaise Pascal in 1642. But Leibniz resolved to make a “universal” calculator that could for the first time do all four basic operations of arithmetic with a single machine. And he wanted to give it a simple “user interface”, where one would for example turn a handle one way for multiplication, and the opposite way for division.

In Leibniz’s papers there are all sorts of diagrams about how the machine should work:

Leibniz's diagrams about how an arithmetic machine should work

Leibniz imagined that his calculator would be of great practical utility—and indeed he seems to have hoped that he would be able to turn it into a successful business. But in practice, Leibniz struggled to get the calculator to work at all reliably. For like other mechanical calculators of its time, it was basically a glorified odometer. And just like in Charles Babbage’s machines nearly 200 years later, it was mechanically difficult to make many wheels move at once when a cascade of carries occurred.

Leibniz at first had a wooden prototype of his machine built, intended to handle just 3 or 4 digits. But when he demoed this to people like Robert Hooke during a visit to London in 1673, it didn’t go very well. Still, he kept on thinking he’d figured everything out—for example in 1679 writing (in French) of the “last correction to the arithmetic machine”:

1679 writing (in French) of the last correction to the arithmetic machine

Notes from 1682 suggest that there were more problems, however:

Notes from 1682 suggesting that there were more problems with the arithmetic machine

But Leibniz had plans drafted up from his notes—and contracted an engineer to build a brass version with more digits:

Plans drafted up from Leibniz's notes

It’s fun to see Leibniz’s “marketing material” for the machine:

Leibniz's "marketing material" for the machine

As well as parts of the “manual” (with 365×24 as a “worked example”):

Usage diagrams of the machine

Complete with detailed usage diagrams:

Detailed usage diagram of the machine

But despite all this effort, problems with the calculator continued. And in fact, for more than 40 years, Leibniz kept on tweaking his calculator—probably altogether spending (in today’s currency) more than a million dollars on it.

So what actually happened to the physical calculator? When I visited Leibniz’s archive, I had to ask. “Well”, my hosts said, “we can show you”. And there in a vault, along with shelves of boxes, was Leibniz’s calculator, looking as good as new in a glass case—here captured by me in a strange juxtaposition of ancient and modern:

Leibniz’s calculator

All the pieces are there. Including a convenient wooden carrying box. Complete with a cranking handle. And, if it worked right, the ability to do any basic arithmetic operation with a few minutes of cranking:

Leibniz’s calculator with the cranking handle

Leibniz clearly viewed his calculator as a practical project. But he still wanted to generalize from it, for example trying to make a general “logic” to describe geometries of mechanical linkages. And he also thought about the nature of numbers and arithmetic. And was particularly struck by binary numbers.

Bases other than 10 had been used in recreational mathematics for several centuries. But Leibniz latched on to base 2 as having particular significance—and perhaps being a key bridge between philosophy, theology and mathematics. And he was encouraged in this by his realization that binary numbers were at the core of the I Ching, which he’d heard about from missionaries to China, and viewed as related in spirit to his characteristica universalis.

Leibniz worked out that it would be possible to build a calculator based on binary. But he appears to have thought that only base 10 could actually be useful.

It’s strange to read what Leibniz wrote about binary numbers. Some of it is clear and practical—and still seems perfectly modern. But some of it is very 17th century—talking for example about how binary proves that everything can be made from nothing, with 1 being identified with God, and 0 with nothing.

Almost nothing was done with binary for a couple of centuries after Leibniz: in fact, until the rise of digital computing in the last few decades. So when one looks at Leibniz’s papers, his calculations in binary are probably what seem most “out of his time”:

Leibniz's calculations in binary
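
In modern Wolfram Language terms, of course, the conversions Leibniz was doing laboriously by hand are one-liners (using the year 1675 from above as an example number):

    IntegerDigits[1675, 2]    (* -> {1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1} *)
    FromDigits[{1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1}, 2]   (* -> 1675 *)
    BaseForm[1675, 2]         (* displays 11010001011 in base 2 *)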

With binary, Leibniz was in a sense seeking the simplest possible underlying structure. And no doubt he was doing something similar when he talked about what he called “monads”. I have to say that I’ve never really understood monads. And usually when I think I almost have, there’s some mention of souls that just throws me completely off.

Still, I’ve always found it tantalizing that Leibniz seemed to conclude that the “best of all possible worlds” is the one “having the greatest variety of phenomena from the smallest number of principles”. And indeed, in the prehistory of my work on A New Kind of Science, when I first started formulating and studying one-dimensional cellular automata in 1981, I considered naming them “polymones”—but at the last minute got cold feet when I got confused again about monads.

There’s always been a certain mystique around Leibniz and his papers. Kurt Gödel—perhaps displaying his paranoia—seemed convinced that Leibniz had discovered great truths that had been suppressed for centuries. But while it is true that Leibniz’s papers were sealed when he died, it was his work on topics like history and genealogy—and the state secrets they might entail—that was the concern.

Leibniz’s papers were unsealed long ago, and after three centuries one might assume that every aspect of them would have been well studied. But the fact is that even after all this time, nobody has actually gone through all of the papers in full detail. It’s not that there are so many of them. Altogether there are only about 200,000 pages—filling perhaps a dozen shelving units (and only a little larger than my own personal archive from just the 1980s). But the problem is the diversity of material. Not only lots of subjects. But also lots of overlapping drafts, notes and letters, with unclear relationships between them.

Leibniz’s archive contains a bewildering array of documents. From the very large:

Very large document from Leibniz's archive

To the very small (Leibniz’s writing got smaller as he got older and more near-sighted):

Very small document from Leibniz's archive

Most of the documents in the archive seem very serious and studious. But despite the high cost of paper in Leibniz’s time, one still finds preserved for posterity the occasional doodle (is that Spinoza, by any chance?):

Documents from the archive with a doodle by Leibniz

Leibniz exchanged mail with hundreds of people—famous and not-so-famous—all over Europe. So now, 300 years later, one can find in his archive “random letters” from the likes of Jacob Bernoulli:

Letter to Leibniz from Jacob Bernoulli

What did Leibniz look like? Here he is, both in an official portrait, and without his rather oversized wig (that was mocked even in his time), that he presumably wore to cover up a large cyst on his head:

Official portrait and statue of Leibniz

As a person, Leibniz seems to have been polite, courtierly and even-tempered. In some ways, he may have come across as something of a nerd, expounding at great depth on all manner of topics. He seems to have taken great pains—as he did in his letters—to adapt to whoever he was talking to, emphasizing theology when he was talking to a theologian, and so on. Like quite a few intellectuals of his time, Leibniz never married, though he seems to have been something of a favorite with women at court.

In his career as a courtier, Leibniz was keen to climb the ladder. But not being into hunting or drinking, he never quite fit in with the inner circles of the rulers he worked for. Late in his life, when George I of Hanover became king of England, it would have been natural for Leibniz to join his court. But Leibniz was told that before he could go, he had to start writing up a history project he’d supposedly been working on for 30 years. Had he done so before he died, he might well have gone to England and had a very different kind of interaction with Newton.

At Leibniz’s archive, there are lots of papers, his mechanical calculator, and one more thing: a folding chair that he took with him when he traveled, and that he had suspended in carriages so he could continue to write as the carriage moved:

Folding chair that Leibniz took with him when he traveled

Leibniz was quite concerned about status (he often styled himself “Gottfried von Leibniz”, though nobody quite knew where the “von” came from). And as a form of recognition for his discoveries, he wanted to have a medal created to commemorate binary numbers. He came up with a detailed design, complete with the tag line omnibus ex nihilo ducendis; sufficit unum (“everything can be derived from nothing; all that is needed is 1”). But nobody ever made the medal for him.

In 2007, though, I wanted to come up with a 60th birthday gift for my friend Greg Chaitin, who has been a long-time Leibniz enthusiast. And so I thought: why not actually make Leibniz’s medal? So we did. Though on the back, instead of the picture of a duke that Leibniz proposed, we put a Latin inscription about Greg’s work.

And when I visited the Leibniz archive, I made sure to bring a copy of the medal, so I could finally put a real medal next to Leibniz’s design:

Leibniz’s medal with the original design

It would have been interesting to know what pithy statement Leibniz might have had on his grave. But as it was, when Leibniz died at the age of 70, his political fortunes were at a low ebb, and no elaborate memorial was constructed. Still, when I was in Hanover, I was keen to see his grave—which turns out to carry just the simple Latin inscription “bones of Leibniz”:

Leibniz's grave

Across town, however, there’s another commemoration of a sort—an outlet store for cookies that carry the name “Leibniz” in his honor:

Outlet store for cookies that carry the name "Leibniz" in his honor

So what should we make of Leibniz in the end? Had history developed differently, there would probably be a direct line from Leibniz to modern computation. But as it is, much of what Leibniz tried to do stands isolated—to be understood mostly by projecting backward from modern computational thinking to the 17th century.

And with what we know now, it is fairly clear what Leibniz understood, and what he did not. He grasped the concept of having formal, symbolic, representations for a wide range of different kinds of things. And he suspected that there might be universal elements (maybe even just 0 and 1) from which these representations could be built. And he understood that from a formal symbolic representation of knowledge, it should be possible to compute its consequences in mechanical ways—and perhaps create new knowledge by an enumeration of possibilities.

Some of what Leibniz wrote was abstract and philosophical—sometimes maddeningly so. But at some level Leibniz was also quite practical. And he had sufficient technical prowess to often be able to make real progress. His typical approach seems to have been to start by trying to create a formal structure to clarify things—with formal notation if possible. And after that his goal was to create some kind of “calculus” from which conclusions could systematically be drawn.

Realistically he only had true success with this in one specific area: continuous “geometrical” mathematics. It’s a pity he never tried more seriously in discrete mathematics, because I think he might have been able to make progress, and might conceivably even have reached the idea of universal computation. He might well also have ended up starting to enumerate possible systems in the kind of way I have done in the computational universe.

One area where he did try his approach was with law. But in this he was surely far too early, and it is only now—300 years later—that computational law is beginning to seem realistic.

Leibniz also tried thinking about physics. But while he made progress with some specific concepts (like kinetic energy), he never managed to come up with any sort of large-scale “system of the world”, of the kind that Newton in effect did in his Principia.

In some ways, I think Leibniz failed to make more progress because he was trying too hard to be practical, and—like Newton—to decode the operation of actual physics, rather than just looking at related formal structures. For had Leibniz tried to do at least the basic kinds of explorations that I did in A New Kind of Science, I don’t think he would have had any technical difficulty—but I think the history of science could have been very different.

And I have come to realize that when Newton won the PR war against Leibniz over the invention of calculus, it was not just credit that was at stake; it was a way of thinking about science. Newton was in a sense quintessentially practical: he invented tools, then showed how these could be used to compute practical results about the physical world. But Leibniz had a broader and more philosophical view, and saw calculus not just as a specific tool in itself, but as an example that should inspire efforts at other kinds of formalization and other kinds of universal tools.

I have often thought that the modern computational way of thinking that I follow is somehow obvious—and somehow an inevitable feature of thinking about things in formal, structured, ways. But it has never been very clear to me whether this apparent obviousness is just the result of modern times, and of our experience with modern practical computer technology. But looking at Leibniz, we get some perspective. And indeed what we see is that some core of modern computational thinking was possible even long before modern times. But the ambient technology and understanding of past centuries put definite limits on how far the thinking could go.

And of course this leads to a sobering question for us today: how much are we failing to realize from the core computational way of thinking because we do not have the ambient technology of the distant future? For me, looking at Leibniz has put this question in sharper focus. And at least one thing seems fairly clear.

In Leibniz’s whole life, he basically saw less than a handful of computers, and all they did was basic arithmetic. Today there are billions of computers in the world, and they do all sorts of things. But in the future there will surely be far, far more computers (made easier to create by the Principle of Computational Equivalence). And no doubt we’ll get to the point where basically everything we make will explicitly be made of computers at every level. And the result is that absolutely everything will be programmable, down to atoms. Of course, biology has in a sense already achieved a restricted version of this. But we will be able to do it completely and everywhere.

At some level we can already see that this implies some merger of computational and physical processes. But just how may be as difficult for us to imagine as things like Mathematica and Wolfram|Alpha would have been for Leibniz.

Leibniz died on November 16, 1716. In 2016 that’ll be 300 years ago.  And it’ll be a good opportunity to make sure everything we have from Leibniz has finally been gone through—and to celebrate after three centuries how many aspects of Leibniz’s core vision are finally coming to fruition, albeit in ways he could never have imagined.
