IPython founder details road map for interactive computing platform
Source: Paul Krill
IPython, for "interactive Python," has been gathering steam as a mechanism for interactive computing and data analysis and visualization. It features the IPython Notebook, which provides a Web-based computational environment that combines code execution, text, mathematics, plots, and rich media.
The open source IPython project was invented by Fernando Perez while at the University of Colorado in 2001, and a formal version 1.0 was released last year. Perez, who is now a research scientist in the Brain Imaging Center at the University of California, Berkeley, sat down with InfoWorld Editor at Large Paul Krill at the recent Strata conference in Silicon Valley to talk about IPython, including its genesis, applications, and a road map for its future.
InfoWorld: What exactly is IPython and what's the main use of it?
Perez: IPython began its life when I was a grad student in physics and I wanted an improved interactive shell over the default interactive Python shell. The Python language provides a shell, but it's limited only to executing the language. I wanted an environment better suited for data analysis as a scientist -- something that would let me interact more easily with my files and with data visualization libraries.
I developed IPython as a superset of the default Python shell and over the years colleagues joined me. It became increasingly popular and widely used in the scientific Python community. And as the development team grew into a strong open source team, we were able to expand that idea of an interactive shell into a more complex system that allows us still to interact with code, execute code, and see the results, not only to do it in a terminal but also through a Web browser.
We've abstracted that idea of interacting in a terminal into a Web browser-based system, which is what we call the IPython Notebook that provides the same experience but allows you to combine not only code but also text, images, results, and effectively build computational documents. One way to think about it is like a word processor, where you're not only typing text, you can also type code, which is executed right inside of the document and whose results are kept right in there so you can build a combination of human language narrative with computation with results and with visualization. This is done through the Web browser.
InfoWorld: Is IPython a breakthrough in data analysis?
Perez: IPython is a tool that builds on the strengths of the Python community and the scientific Python toolkits for data analysis. The breakthrough, if there is any, should be judged by the user community. But what we are trying to contribute is having a tool that provides a very fluid experience, so when scientists are working with their data and they are trying to understand a problem, they are as close to the data and the results and experience as possible, with as few barriers between the code they're trying to type and the [obtaining] of results. That then allows them to communicate whatever insights they obtain with others.
These notebook documents, once saved, they're files that can be emailed to others, they can be converted to Web pages, and they can actually be shared online. We provide a service that renders notebooks into Web pages so that people can share their insights without asking the recipients to install anything. It's called the notebook viewer, and it lets anyone share a link with others and they can read their data analysis, including viewing their graphics and the results and whatnot without having to install anything.
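To make the "they're files" point concrete: a saved notebook is a JSON document holding an ordered list of cells. The sketch below builds and round-trips a minimal one using only the standard library. It assumes the later version 4 notebook layout, which is not necessarily what the releases discussed in this interview wrote to disk, and the helper names (`make_notebook`, `markdown_cell`, `code_cell`) are hypothetical, not part of IPython.

```python
import json


def markdown_cell(text):
    """Return a narrative (markdown) cell in the assumed version 4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return an unexecuted code cell: no outputs and no execution count yet."""
    return {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": source,
    }


def make_notebook(cells):
    """Wrap a list of cells in the top-level notebook structure."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": cells,
    }


# Narrative text and code live side by side in one document.
notebook = make_notebook([
    markdown_cell("# Falling object\nDistance fallen after *t* seconds."),
    code_cell("g = 9.81\nt = 2.0\n0.5 * g * t ** 2"),
])

# Serializing gives plain text that can be emailed or rendered elsewhere.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)
```

Because the file is ordinary JSON, a service can render it to a Web page without running any of the code, which is what makes sharing without installation possible.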
InfoWorld: At what stage of development is IPython?
Perez: We are about to release [version] 2.0 in a few weeks. We have a number of very interesting and valuable features coming up, including a very sophisticated framework for interactive JavaScript widgets, for browser-based rich interactivity that can have connections back to a computational backend. That's about to be released. We've [also] made lots of improvements to the notebook and usability.
InfoWorld: What does IPython 2.0 bring to JavaScript developers?
Perez: It means that it becomes very easy for users in any programming language, and I want to reiterate how IPython these days is language-agnostic and how these tools can be used in other programming languages. It allows people who are developing code in other programming languages to produce computational output that immediately couples the JavaScript visualization. For example, we're seeing people writing D3 visualization where you can execute Python code, but immediately in the browser a D3-based rich visualization pops up. What it will do is it will expand the reach of what Web developers are doing, better coupling them to the community that traditionally uses languages like Python and R to do their analytics by building a bridge between the traditional programming languages communities and the Web developers. The Web developers normally only code in JavaScript, but the people who are running Python or R tend not to be Web developers. This will help build a bridge between those two communities.
InfoWorld: What other features are in IPython 2.0?
Perez: We've made a lot of improvements to the usability of the notebook in that we've solved the major limitation it had that required you to start many notebook servers in many different parts of your file system. It seems like a simple thing, but the ability for a single notebook server to navigate your entire file system is actually a huge usability win. There've been improvements to the underlying machinery that converts notebooks to other formats, so if you want to convert a notebook into a blog post, into a Web page, into a PDF for a paper or for a book or to print a report, that machinery has also seen a lot of improvements.
InfoWorld: Is there anything else you want to say about IPython?
Perez: [For] IPython 3.0 we also have very ambitious plans. That will be released hopefully in the summer. What we will do is we will expand the current notebook server from being something that is tied to a single user into something that can be run for multiple users, so you can run a notebook server on a multi-user server, for example, and when users log in they will automatically get their own notebook machinery. The idea of deploying a notebook server in a research group, in a data analytics team, for a class at a university -- this is a feature that we have enormous demand for. We've already done all the design work for it. The plan is to implement it over the course of the next six months or so.
InfoWorld: I don't know if you want to talk about 4.0, 5.0, whatever?
Perez: We know that the multi-user server is going to be a very complex piece of work and we expect 3.0 to be just an early release. And 4.0 is going to be continuing to improve on that.
InfoWorld: When is 4.0 due?
Perez: Probably early 2015.
InfoWorld: What's the main link between IPython and big data?
Perez: There are several. On one hand, this interactive analysis environment is already useful to work on data, even if it's small data on a laptop. The fact that it runs through the Web browser is very important because it means that the actual processing engine can be running on a remote node. This morning we saw a talk about how to do machine learning using IPython on a cluster, which allows you to put the IPython engine somewhere the big machines or the cloud data storage is, then interact with it from a laptop through a Web browser. Finally, IPython actually has a subcomponent of the project called IPython.parallel that allows you to do the same kind of interactive fluid experience, but not just with one node but with "n" nodes, with an entire parallel cluster. This morning's talk about scikit-learn and IPython.parallel was demonstrating precisely how to do interactive parallel machine learning in the cloud.
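The "one node versus n nodes" idea can be sketched as a map of the same function over several workers. The example below is only an illustration of that pattern: it swaps in the standard library's `concurrent.futures` thread pool as a stand-in and does not use the IPython.parallel API. The scoring function is a hypothetical placeholder for an expensive model fit such as the scikit-learn work mentioned in the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def score_parameter(alpha):
    """Hypothetical stand-in for an expensive model fit.

    Returns (alpha, score); the score peaks at alpha == 0.3.
    """
    return alpha, round(1.0 / (1.0 + (alpha - 0.3) ** 2), 4)


def parallel_search(alphas, n_workers):
    """Evaluate every candidate on a pool of workers and keep the best one."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() sends one candidate to each free worker and preserves order.
        results = list(pool.map(score_parameter, alphas))
    return max(results, key=lambda pair: pair[1])


best_alpha, best_score = parallel_search([0.1, 0.2, 0.3, 0.4, 0.5], n_workers=4)
print(best_alpha, best_score)
```

In a cluster setting the workers would be remote engines rather than local threads, but the calling code keeps the same shape: hand over a function and a list of inputs, then collect the results interactively.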
InfoWorld: What are the commercial or enterprise business opportunities for IPython?
Perez: We're seeing a number of adopters, and in this conference we've learned of a few more. There's some we knew about; a company in Austin called Enthought has been a longtime supporter of the project, and it ships a [product] called Canopy that includes IPython. Another company -- also in Austin -- called Continuum Analytics, also provides an online version of the notebook called Wakari. We're seeing a number of companies using it even in public-facing services. We've also had conversations with many companies who have come and told us, "Oh, inside of our data analytics group, it's IPython everywhere."
InfoWorld: Are you looking to monetize IPython personally or form a company around it? Or are you leaving it as an academic project?
Perez: Currently it's being operated as an academic project and we have funding from the Alfred P. Sloan Foundation. We also have funding from the Simons Foundation and the National Science Foundation. But we've also received last year a donation from Microsoft Research that has helped us tremendously. And we are actively looking to partner with industries in what we think can be a very productive model for funding an open source academic effort that has great impact in industry.
InfoWorld: Is Microsoft planning to use IPython in any of its products?
Perez: They ship IPython as part of the Python tools for Visual Studio plug-in.
This story, "IPython founder details road map for interactive computing platform," was originally published at InfoWorld.com.