>> Kristin Lauter: William Stein from UW is here to speak to us about Sage. William is a professor in the math department, a professor of number theory, and has written several books on number theory. He's also the founder and main developer of the Sage software project. Thank you.
>> William Stein: Thank you. So today, I'll talk about Sage. I'm going to describe briefly the background to Sage, kind of where it came from and some of the advantages and motivation of Sage. And then I'll just show you a couple of examples. So that will be my talk. First, a quick poll. How many of you use mathematical software, like Matlab or PARI or Maple? Wow, almost every person in the audience, okay.
So here's a quick history of Sage. I personally use mathematical software a lot, and I started using software a lot when I was a Ph.D. student at Berkeley in about 1997, and I contributed a lot to Magma, which is a closed source but very, very powerful computer algebra system for doing cryptography, number theory, algebra, group theory, that sort of thing. It's a lot better than some of the more famous systems, at least for those areas of mathematics. But in 2005, I started a new project called Sage, which mainly has, as its goal, to be technically more modern than a lot of those other systems, and it's also free, which is a big advantage.
The funding model is actually very similar to the funding for Magma. It's just that for Magma, they have government grants and they charge users. And with Sage, we just have government grants and maybe get some donations from users. But it's a similar funding model. Magma's been funded for decades that way with no trouble. I think Sage can be as well. And I think it's unnecessary to charge users. And so far, that's proven to be the case. So that's one of the fundamental design constraints.
One of the advantages of being free is we get a bigger choice of preexisting libraries to draw on in building Sage. We get to choose anything that's out there that's free. And that's quite a lot of good stuff. Sage 1.0 was released almost -- I guess to this day, I think it would be three years ago. Maybe yesterday three years ago. But it's basically the three-year anniversary of Sage. Or birthday, not anniversary. I didn't marry Sage.
So there have been a bunch of Sage Days workshops. These are where a lot of Sage development work gets done. There are Sage Days 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. And then Sage Days 12, which I'm wearing the t-shirt for. The theme of Sage Days 12, which was two weeks ago in San Diego, was to fix as many bugs in Sage as we could. So about 18 of us spent a week, around the clock, fixing bug after bug after bug after bug. I personally fixed, I think, 25 bugs. So it turns out that if you take the amount the conference cost the funder of the conference and divide that by the number of bugs, it cost about $60 per bug. So it's pretty efficient for Sage development. But that's how much it costs to fix a bug. Also, the most productive Sage developers, working as hard as they can basically full-time for about five days, can fix between 20 and 25 bugs. So that's not so many. We all thought we would fix twice as many as we did when we got going. So it's surprising how long it takes to fix a bug. Yes?
>> How long is the current bug list?
>> William Stein: There are, let's see. I have a nice little document about this which I will just pull up. This is kind of silly, but -- let me just show you a little thing. That's actually the wrong thing. Open report.pdf. So this is just a summary of what happened. So those are the bugs in Sage. These are the bugs we fixed at the workshop that we know of.
So more precisely, there were 621 known defects in Sage when we started, and we fixed over 200 of them. And these are all listed on a public bug tracker that you can go to. I'll show it to you. So right here, this is the Sage bug tracker, and here you can list, for example, active tickets. Some of them are enhancements, some of them are bugs, et cetera. And this lists 995. These are not all defects. Many of these things are requests to add new features to Sage, et cetera. You can see each of the known bugs. There are some bugs which are very, very fine corner cases; they get reported, you'd like to fix them, but they're not a big deal. And there are some that are really a big deal. Of course, we aimed at fixing as many of the big deal bugs as we could during Sage Days. So I think the next Sage release, Sage 3.3, is going to be very, very good from the point of view of having fewer bugs than any other Sage.
So that's what we did at the last Sage Days, which was in San Diego. And the entire conference was funded by the Department of Defense, because they find Sage very useful for cryptography research, so they really want Sage to thrive, and they're funding its development. So we have a lot of workshops, and this really got developer energy going. I think quite a lot of developer energy got going already at Sage Days 1, which Kristin was at, which was in San Diego three years ago, and it's sort of just grown ever since.
We also won the first prize in some international contest for software. So in the scientific category, Sage won first prize. And we've got funding from various organizations for work on Sage. For example, last summer, Google funded Robert Miller, who is sitting here, to work all summer on what --
>> [inaudible].
>> William Stein: On partition refinement functions so common [unintelligible] type code.
Sage has a really large amount of highly optimized code for working with graphs and partitions, which Robert wrote over the course of about two or three years. And this is new code that often is not available anywhere else.
Here's what I view Microsoft's interest in Sage as. Definitely, I'm going to tell you what Sage is in a minute. But I just want to give you this overview. So my impression is that Microsoft's interest in Sage is that, one, it could be a useful tool to support research in cryptography and related areas such as [unintelligible] graph theory, linear algebra, signal processing and so on. So, you know, maybe Kristin wants to compute something, and it's difficult to do in other systems, and it can be made easy to do in Sage, so she uses Sage to do it. Sage has a lot of code that's really, really useful for doing cryptography research.
Another reason Sage is interesting to Microsoft is in enhancing the viability of Windows as a platform for doing high performance computing. I mean, we're aware of an effort by another part of Microsoft to fund R development. So R being the very, very famous stats package. And the reason being that currently there's no 64-bit version of R for Windows. It's kind of a second class citizen in Windows. There's 32-bit only. It builds on MinGW instead of MSVC, et cetera. It's not fully supported under Windows. It's bad from an HPC point of view. You know, if you want to sell customers on buying a big HPC system and installing Windows on it, and the main software they use happens to be, say, R, and maybe they want to do some crypto research that involves some Sage, it's kind of bad if Sage or R doesn't work on Windows or doesn't work very, very well, because the customer might have to use Linux on their cluster, which would be undesirable in that you would restrict their choice.
So another reason that I think Microsoft would have an interest in Sage is that it would be good if Sage ran very well on Microsoft's operating system, especially 64-bit Sage. And as you'll see in a minute, Sage itself is kind of like a distribution of most of the open source math software out there. So if you make Sage run well on a platform, you're making pretty much all the open source math software out there run well on that platform. So Sage includes [unintelligible] PARI, Singular, Maxima, et cetera. It includes, across the board, almost all the open source math software out there. And these are the two reasons that Microsoft is funding Sage development: so that it will be a better tool for cryptography and other research, and so that Sage will be much better supported on Windows.
Okay. Now I'm just going to give you a little glimpse into the Sage community. Just to emphasize, it's a free program. I've mentioned that several times now. There's also a free web application. Anyone can go to a certain website, sagenb.org, and I'll do that right now, and they can use Sage. There's also a Sage support mailing list with 964 members.
So I'm going to switch to the machine that's physically located in here, so it should switch over. And I'm running Internet Explorer, and I go to this website, sagenb.org. Then I log in using an account I made. You can just click create a new account and then you'd have your own account. I'll just log in. And then I get a list of worksheets, and I can create a new one. I just made a brand new one right here, actually. And then I can do calculations in Sage. I can do arithmetic, two plus three. I can also draw plots. So plot, say, sine of x cubed times cosine of x from 0 to 5. And you get a nice plot. So that's just showing you how Sage works and how you can use it from just some random computer somewhere by just going to this website. I'll switch back to the presentation. Yes?
>> Is this running locally?
>> William Stein: No.
When I went to that website right here, I'll switch back for a second. That's running on sagenb.org, which is a computer in the mathematics department. In fact, sagenb.org is running on this hardware. So that right there, that's Internet Explorer running on this computer right here accessing a web page which is at the University of Washington.
>> [inaudible].
>> William Stein: The only thing that I downloaded was the JavaScript that defines that web page. It's just a web app, an Ajax web app. Here's the computer it's actually running on. This is a computer that's sitting in the basement of the mathematics department. And this is the machine that we do most of our Sage development on. It's four Sun Fire X4450s, each with 128 gigabytes of memory and 24 Xeon cores, and then a big disk box here.
This, what I'm showing you here, is also Sage. It looks like a presentation. It's actually Sage, and this is running locally on my laptop. So when I do a calculation here, this is happening completely locally, just on my laptop. This is not being served over the web. But you can't tell the difference, except that it might be a little faster, or maybe a little slower. I mean, that hardware I just showed you is faster than my laptop. So the time to do the actual CPU-bound work is shorter if I do it remotely, whereas the time to get the answer to display is shorter if I do it locally. So maybe this is a little snappier than if I do it on the other one.
There's a mailing list for Sage called Sage Support, and last time I checked, it had 964 members. And I'm pulling it up right here. And now it has, whoa, wait a minute. Did people drop out? 955. Hm. That's odd. Wow. Huh. Anyways, now it has 955 members. Maybe several were discovered to be spammers or something. I don't know.
Here's the Sage website. So apparently I can show you it directly, since I have Internet access. So I'll just go to sagemath.org.
This is the Sage website, and there's a big download link that you can click on. This link gives you the Sage notebook that I just showed you. If you click on quick start, it gives you just a quick summary of what Sage is about, what it can do. Kind of gives you a little tour. And there's extensive help. If you want to know whether Sage can do something, you can just type a question in here, and it does a search of all the Sage documentation, the Sage users groups and so on. In fact, somebody ask, can Sage do something? Just somebody.
>> [inaudible].
>> William Stein: Do what?
>> Anti-commuting variables.
>> William Stein: I'm going to write non-commuting variables. See what comes up. Google found nothing. Google found 12 results. So I guess the web found something. There was nothing in any of the Sage documentation, at least with non dash commuting. Maybe you should just write noncommuting without the dash. Let's see. It's giving nothing at all. Hm. I know that Sage can do things with -- noncommutative. I'll just search for noncommutative. Ah. So quaternion algebras. So that's an example. Matrix bases, quaternion algebras. Working with matrices over noncommutative rings, et cetera. For example, if we click here on quaternion algebras, we'll get the section of the Sage reference manual on quaternion algebras, and you can see an example of making a quaternion algebra. And you can paste this example into Sage, click a box right here, and now we have a quaternion algebra. And see, i times j is k, that sort of thing. If I define it this way, then the variables are directly defined. So I can do i times j plus k, get two times k, et cetera.
So that is the Sage website. Another neat thing that is on the Sage website is a development map. There's acknowledgment. So this lists organizations that have funded Sage. You can see there are a lot. There's a map which lists the people involved with Sage development, which is getting really, really large.
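The "i times j is k" and "i times j plus k is two times k" behavior from the quaternion demo can be mimicked in plain Python. This is only a minimal sketch, assuming Hamilton's quaternions (i squared and j squared both equal to minus one); the class name `Quat` is hypothetical and this is not Sage's quaternion algebra implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quat:
    """Hypothetical toy quaternion a + b*i + c*j + d*k with i^2 = j^2 = -1."""
    a: int = 0
    b: int = 0
    c: int = 0
    d: int = 0

    def __add__(self, o):
        return Quat(self.a + o.a, self.b + o.b, self.c + o.c, self.d + o.d)

    def __mul__(self, o):
        # Hamilton's product; the order of the factors matters.
        a1, b1, c1, d1 = self.a, self.b, self.c, self.d
        a2, b2, c2, d2 = o.a, o.b, o.c, o.d
        return Quat(
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
        )


i, j, k = Quat(b=1), Quat(c=1), Quat(d=1)
```

With these definitions `i * j` compares equal to `k`, while `j * i` gives the negative of `k`, which is the noncommutativity the audience question was about.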
So that's a list of people that have contributed code that's got into Sage. And I think now the number's well over 100. So a lot of different people have contributed. In fact, one of the motivations for me when I started Sage was a conversation I had with a graduate student at the University of Texas at Austin, where he asked: if I come up with a nice algorithm and implement it, what do I implement it in? If I implement it in Magma, the people in Uruguay, where I'm from, are not going to be able to use it, because Magma costs $550 at the third world discount, which is six months' salary there or something. If I implement it in PARI, only the PARI people. What system should I implement my code in? And Sage is an answer to that question, and a lot of people are agreeing with that answer, as you can see by the list of people developing code. There are a lot of people here I've never met who are actively including code in Sage.
For example, like a week or two ago, there was some random guy from France who I've never met who popped up and started implementing very fast code for computing discrete logs in generic groups. He just started implementing the standard algorithms and doing a very good job at it. That's really good. It just sort of pops up, appears. You post the code. We referee it. I can actually probably find the code right here. So I think he implemented, for example -- it's in small groups. So he's not implementing index calculus attacks on discrete logs. But it's this guy, YLCHAPUY. He just popped up, posted this code. Here it is. And it's a nice little implementation. This is just one of several things he's posting. And what happened was he described it on the developer mailing list, then he posted the code, then I asked him to, you know, explain: is it faster? And here's doing some simple discrete log in the finite field F15 in Sage 3.2.3, the latest release version of Sage, I guess.
That one calculation takes 276 seconds. And now, after applying his patch, it takes 0.14 seconds. So that's a lot better. So it's really nice that people just notice these sorts of things and just start speeding them up left and right.
>> [inaudible].
>> William Stein: I don't think it was using [unintelligible] at all. I think it was using baby step giant step, yeah. So I mean, we have some very generic code. First, it was probably doing a for loop. And then after that, somebody implemented baby step giant step. And after that, somebody came along and does this. I'm sure he's doing this because for his research, it's very important that it be fast. Yes?
>> Are there any concerns about people pulling operating code from somewhere else or the algorithms might be patented or something like that?
>> William Stein: We have concerns, definitely. One thing we do is make sure people copyright their code so they're at fault. And they have to release it to us under a license that allows us to include it in Sage. That could be GPLv2+. It could be a BSD license, or it could be the most permissive Microsoft shared license, which is GPLv2+ compatible. So that would be a possibility also. Nobody's contributed code under that. But my understanding is it's possible for you to contribute code under one of Microsoft's licenses. And there is one that would work. That's one thing we do.
The other thing is every single line of code that gets into Sage is peer reviewed. That helps a lot. For example, this particular patch, I reviewed it. And if you look, let's see, John Cremona also reviewed it, and he wrote all this stuff about what the heck is going on. And then it looks like Michael pointed out that I'd already reviewed it positively. But John Cremona is a pretty famous number theorist. He's the chief editor of the London Math Society. The Journal of the London Math Society. He's a pretty serious guy.
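The progression mentioned in the discrete log discussion, from a plain for loop to baby step giant step, can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions: it works in the multiplicative group modulo a prime `p`, the function names `naive_dlog` and `bsgs_dlog` are hypothetical, and it is not the patch that was shown on the ticket.

```python
from math import isqrt


def naive_dlog(g, h, p):
    """Try exponents 0, 1, 2, ... in a plain loop: about p multiplications."""
    value = 1
    for x in range(p - 1):
        if value == h % p:
            return x
        value = value * g % p
    return None


def bsgs_dlog(g, h, p):
    """Baby step giant step: about sqrt(p) multiplications and memory."""
    m = isqrt(p - 1) + 1
    # Baby steps: remember g^j for 0 <= j < m.
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)
        value = value * g % p
    # Giant steps: multiply h by g^(-m) until it lands on a stored baby step.
    factor = pow(g, -m, p)  # negative exponents with a modulus need Python 3.8+
    gamma = h % p
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None
```

Both functions return an exponent `x` with `pow(g, x, p) == h`, or `None` when `h` is not a power of `g`. The second trades memory for time, which is the kind of change that turns minutes into fractions of a second for moderately sized groups.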
John Cremona is also a very involved Sage developer, and he very regularly referees patches for the Sage group. So you can see that here. And now it got merged into Sage. So it just gives you a glimpse into how Sage development works. Lots of people all over the world do this. But it's totally possible that somebody could sneak some patented code in. We would know exactly who did it. We would be able to remove it if somebody complained, but it's totally possible it could happen. We don't have lawyers carefully looking over stuff like you guys do. But at least we would know how the code got there, and we could get rid of it. Yes?
>> When you suspect that, do you link the code to the DLL, or do you link it directly? If it's DLL [unintelligible], if it's directly, you have to relink it.
>> William Stein: All code we get comes in as source code. It's sort of like this code right here literally modifies. This is a patch that modifies our core library. When we get code submissions from other people, they're modifying how Sage is written itself. I think that answers your question. It would be difficult if it turns out that an algorithm were patented and we had to remove it or we had to get permission from the patent owner, whatever. It would be difficult because we'd have to revert the patch, which would make everything slow again, and maybe some other patches people had built on top of it. It could be painful, yes. It hasn't happened yet to us. But if it does, I think we'll know exactly where it came from, and we'll listen to whoever is complaining and then we'll fix it. That's the plan.
>> And you're noncommercial, which makes it easier.
>> William Stein: Yeah, we make zero money. There's no company. I want to make that clear. There's no company at all. All money that we get for Sage goes into an account at the University of Washington. If anybody gives money, it's a tax deductible contribution. It's all a not for profit type thing.
So we tend -- you know, if you want 50% of our money that we make off of Sage, you're welcome to it, because it's 50% of zero.
All right. Here are some advantages of Sage over Mathematica, Matlab, Maple. There are certainly advantages and disadvantages to Sage as compared to Matlab and Mathematica and so on. One advantage of Sage: the core language, the language used for interaction, is Python, which is a general purpose language. It's easily one of the world's top ten most popular languages. There's definitely some strong interest in Python at Microsoft, for example. There's the IronPython project, which is an implementation of Python on top of the .NET Framework. So I think Jim Hugunin or something is the guy who works here who works on IronPython. And it will definitely be worth thinking about. As Sage gets better and better support under MSVC and under Windows, it will be interesting to see how it can work better with IronPython. We'll see how that goes.
Another advantage of Sage: instead of using a custom language like Matlab's very mathematically oriented language or Maple's very odd and mathematically oriented language, it's just a normal, general purpose language. Python is a good language at that. It's nice to use, friendly. People pick it up really quickly in practice.
A second big advantage of Sage is it's really easy to write compiled code in Sage that's just as fast as custom C code. That's not easy to do in Matlab, Magma or Maple. It is easy to do in Sage. I'll show you some examples of that. That's a huge, huge advantage of Sage. People regularly post saying why doesn't Magma do something like this, why doesn't Maple do this? It's really something that partly comes for free in Python. It's a pretty neat technology that allows us to do this, called Cython. I'll show you an example of that in a minute.
There's a lot of cool stuff that's just included in Sage, or available to Sage, because we use a general purpose language, because we use Python, and Python has, you know, millions of users around the world, so it has a lot more users than these other systems, and there's a lot of code that's a priori completely unrelated to mathematics that you can just use from Sage. For example, say you want to write a web server in Sage. Well, it's a few lines of code, because there already exists a library called Twisted that provides a web server.
Let's say you want to write a command e-mail. This would be a spam lover's dream. There's a command in Sage, e-mail, that sends an e-mail message. It doesn't require you to have a mail server set up anywhere. In fact, Sage contains a very serious implementation of a mail server itself. Sage is a mail server because the Twisted library in Sage is a mail server. So for example -- good thing this isn't tested, because every time you test Sage, it would send an e-mail. But if I do this, you know, you could have some calculation you're running. You want to e-mail yourself to tell you that the calculation is done; it's just a command in Sage to do that. It, in fact, immediately returns; it starts a child process. That child process sits there, and it sends an e-mail when it's capable of connecting to the mail server. So if I were to go check my e-mail, I would find a message there saying the calculation is finished. I guess I can do that. Hopefully, I don't have any embarrassing e-mails. We'll soon find out. So there it is. The calculation is finished. So it's nice that you can e-mail yourself like that. That's the sort of thing that you could accomplish in some of the other Ma's, but it wouldn't be nearly as nice.
I mean, if you wanted to do it in one of the other systems, you'd have to have some other program like mail installed on your machine and have to go through that, whereas here it's a command that's built in, that will work on any copy of Sage anywhere, just for example. But there's a ton of stuff like this where --
>> [inaudible].
>> William Stein: That was left over from -- I'm going to refresh this. That's left over from this example right here. So the very first example in the Python tutorial is right here. And in the Python tutorial, most examples are taken from Monty Python's Flying Circus. Python is named after that show rather than the snake, in case you're curious.
So I'm just going to show you a bunch of examples of Sage now. The examples will range from a bunch of calculus examples to some examples of using Cython to speed up real-world code to some number plotting examples, 2D and 3D plotting examples. Here's an example where I create two symbolic variables, and now I can type in expressions that involve these symbolic variables, like x to the power of y, pi, square root of two. They're all exact objects. It has a similar feel to Maple, Mathematica or Maxima. By default, you see the output as just a line of text, but if you type show of something, you'll see it nicely typeset. It's actually very nicely typeset. Here, let me zoom in on that for a second. Actually, I can probably zoom with that, although the resolution is low. But I think I'll zoom in just to emphasize that it is not an image. It actually typesets mathematics. I'm going to zoom in by increasing the font size of my browser. You see that it actually got bigger. Each of those characters is an individual character, as you can see by highlighting them. If you double click, you see the [unintelligible] that defines it. What I have here is jsMath. jsMath is a JavaScript implementation of the TeX layout engine.
This guy, Davide Cervone, sat down with the TeXbook and implemented the TeX algorithms, line by line, in JavaScript. It's a few thousand lines of JavaScript, but it's pretty cool because it's nicely integrated into the Sage notebook. What it gives you is you can type in essentially any TeX expression or any Sage expression. Sage expressions know how to TeX themselves, and you'll get it beautifully typeset exactly like you want, because you like TeX display. So that's what this is. I mean, I assume you like TeX.
Here's another example, just a little more complicated, expanding out a cubed. Here's an example of solving an equation. So this is a cubic equation. So you can write down an exact solution. You can, of course, change the form, so maybe I want it to be 17 times -- how about square root of 17 times a x plus b. What happens here, secretly, behind the scenes, currently, is that this equation gets converted into an equation in Maxima. Maxima solves it. Then we parse the result, and that's what's here. It's probably the case that Maxima will not be used to do all of these sorts of solve operations in the long run. I think that will last for a certain amount of time. Currently a lot of the pure symbolic manipulation stuff, the very calculus oriented stuff, is implemented using Maxima. We've been doing a lot of work to move away from that. We have a native C++ library called Pynac, where if you make your variables and just give this option, new symbolics equals 1, then you get these Pynac variables, which work almost the same as the current symbolic variables, but manipulation is way faster. For example, if I expand out a plus b plus c to the power of, say, 20, using Pynac, it takes that long. And I think Maxima will take longer.
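To give a feel for the size of the computation behind expanding a plus b plus c to the 20th power, here is a minimal plain Python sketch. It assumes a sparse representation in which a polynomial is a dictionary from exponent tuples to integer coefficients; the helper names `poly_mul` and `poly_pow` are hypothetical, and this is neither Pynac's nor Maxima's algorithm.

```python
from collections import defaultdict


def poly_mul(f, g):
    """Multiply sparse polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            out[tuple(x + y for x, y in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c}


def poly_pow(f, n):
    """Raise f to the n-th power by repeated squaring."""
    nvars = len(next(iter(f)))
    result = {(0,) * nvars: 1}  # the constant polynomial 1
    while n:
        if n & 1:
            result = poly_mul(result, f)
        f = poly_mul(f, f)
        n >>= 1
    return result


# a + b + c, with exponent tuples ordered as (a, b, c).
a_plus_b_plus_c = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
expanded = poly_pow(a_plus_b_plus_c, 20)
```

The result has one term for each way of writing 20 as an ordered sum of three nonnegative exponents, and the coefficient stored under `(10, 10, 0)` is the binomial coefficient 20 choose 10.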
So if I do exactly the same thing now using -- it's about, well, so the first one was really 0.01, because it didn't use -- everything happened at CPU, the other is 0.07. You can see there's a definite difference in speed. The speed difference will become bigger if you do much larger things. Our interface to Maxima is optimized, but we can get a lot more out of something that's native C++. The other nice thing about the new native C++ stuff that we have is you can do symbolic manipulation over finite fields and other funny base rings, which is pretty amusing.
All right. Here's an example of a huge integer determinant. I'm making a random 200 by 200 matrix. Every single entry has 128 bits. So just to show you what the entries of the matrix look like, here's the top left entry. So each number in the matrix is about that big. So it's a big 200 by 200 matrix. And I compute the determinant, which is a very large number. And you can see it took less than three and a half seconds. Here's the actual determinant of that matrix. It's really, really big. Sage is very, very good at computing determinants of matrices with large numbers. And this example illustrates making a random matrix and then how you compute the determinant. You call the method determinant on the object that's the matrix.
If you want to see what other things you can do with a matrix, if you do A dot and hit the tab key -- so, let's see. How shall I do this? I guess if you pretend like you're ready to call it and you don't know what to do. You're like, I want to call A dot something, but I don't know. You hit the tab key and it will show you all the options. Like playing a game and it shows you all the directions you can go in. Those are all the things you can do with the matrix. You could compute the determinant. You can differentiate all the entries, get zero here, ask if it's a square matrix.
Compute the left kernel of the matrix. Compute [unintelligible] vectors and so on. LLL reduce the matrix. LLL reduce the lattice with a given Gram matrix, et cetera, the BKZ algorithm. So you get a big list of possibilities. Then if you decide, oh, I want to do determinant, then if you're about to call it and you hit the tab key again, it will tell you about the function you're about to call. So here, you find out that this will call the determinant function. There's a couple of different algorithms you can use. For example, NTL has a pretty good implementation for a certain range of inputs. There's also a [unintelligible] algorithm, or you could use LinBox and so on.
If you look at almost any function in Sage -- well, actually 65% of the functions in Sage -- you will find lots of examples, like this, which illustrate usage of the function. And the examples are automatically tested on a regular basis. Before we make a Sage release, we test every single one of these 80,000 inputs on about, I think, 20 different operating system and hardware combinations. So, for example, on one of those machines that I showed you before when I showed you that big rack of computers, one of them is running VMware Server, which has 12 different Linux distributions installed into it. So we just build Sage in all of them from scratch, run the full test suite, and we do that before we release any copy of Sage. And then we also have some other hardware in other places. For example, we have a build farm at the DOD that allows us to build Sage on all the machines that are of interest to them, like [unintelligible] Linux, all these sort of exotic machines, before we release Sage.
So again, there's the determinant of this matrix. It takes about three seconds. Pretty big number. Here's an example of how to do the same calculation using Maple. The nice thing is you can call Maple directly from Sage.
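The idea of documentation examples that are also run as tests, described above, can be tried with Python's standard `doctest` module. This is a minimal sketch rather than Sage's own test runner: the function `det2` is hypothetical, and Sage's docstrings use a `sage:` prompt where plain Python uses `>>>`.

```python
import doctest


def det2(a, b, c, d):
    """Return the determinant of the 2x2 matrix [[a, b], [c, d]].

    EXAMPLES::

        >>> det2(1, 2, 3, 4)
        -2
        >>> det2(2, 0, 0, 2)
        4
    """
    return a * d - b * c


# Extract the examples from the docstring and run each one,
# comparing the actual output with the output written in the docstring.
parser = doctest.DocTestParser()
test = parser.get_doctest(det2.__doc__, {"det2": det2}, "det2", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

Here `results.attempted` counts the examples that were run and `results.failed` counts the ones whose output did not match, so a release check can simply insist that the failure count is zero.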
This is a kind of unique functionality Sage has that is different from what you get in a lot of other systems. You can call -- like the command Maple of A right there, that takes the matrix A that we have and converts it into a Maple matrix and gives us back B. B is a reference to a matrix in a running session of Maple. There's one session of Maple that got started running. It's sitting there, and inside of that session, there's now a matrix B that is equal to our matrix A. And then by calling determinant, which is a Maple function, it computes the determinant of that matrix. In Maple 12, there's some much faster determinant code than in Maple 11 or earlier, which would have taken hours for this. And it will do the determinant quite quickly. Yes?
>> If you haven't paid for Maple, would this be a run time error or compile time error?
>> William Stein: Run time.
>> Run time?
>> William Stein: Yes. You'll see at run time a message that you should install Maple, and where you can get Maple from, like the Maple website in case you've never heard of Maple. In fact, how about if we switch over to -- remember, I was running this on a public server anyone can log into. I don't want to get in trouble. Maple better not work here. If it does, Maple is going to sue me for violating my license agreement. So here's what you get. If you try to run Maple on a machine that doesn't have Maple, it says unable to run Maple. It gives kind of a teeny piece of a traceback. If you click to the left, it gives you a lot more. And then it gives you some hints about how to set up Maple on your computer, and so on. Tells you where to buy Maple. Okay? So that's what happens. It happens at run time, so there's no compile time linking to Maple, which is a serious worry. How do we talk to Maple without actually linking it in somehow? We use a pseudo TTY so we can talk to Maple.
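The two points made above, that the external program is a separately running session and that a missing program only shows up at run time, can be sketched with the Python standard library. This is a simplified illustration under stated assumptions: it uses ordinary pipes from `subprocess` instead of a pseudo TTY, a child Python process stands in for Maple, and the class name `ExternalSession` and the constant `CHILD_PROGRAM` are hypothetical.

```python
import shutil
import subprocess
import sys

# Stand-in "external system": a child Python that evaluates one line at a time.
CHILD_PROGRAM = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(eval(line), flush=True)\n"
)


class ExternalSession:
    """Hypothetical wrapper that keeps one external program running."""

    def __init__(self, argv):
        # Nothing is linked in ahead of time: the program is only looked up
        # when a session is requested, so a missing program fails here.
        if shutil.which(argv[0]) is None:
            raise RuntimeError(f"unable to run {argv[0]!r}: program not found")
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered, so each command is sent promptly
        )

    def eval(self, line):
        """Send one line to the running session and read one line back."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

A session created with `ExternalSession([sys.executable, "-c", CHILD_PROGRAM])` answers `eval("2 + 3")` with the string `"5"` and stays alive between calls, much as the single Maple session in the demo holds on to the matrix B. Asking for a program that is not installed raises the `RuntimeError` only when the session is created.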
>> If it's a run time error in a complicated program, you never know whether you're going to get a given [unintelligible] message after midnight. >> William Stein: Yeah. Fortunately -- so I showed you the 150 Sage developers, and there are many thousands of Sage users. Most of them use Sage because they don't have Maple or Mathematica or these other systems, or they don't want to use them. So the -- and it could have -- a priori, you have no idea, if you're not a Sage developer, the direction Sage development has gone. Here's the direction. As much as possible -- actually, I would say 100% -- if you call a Sage command and it doesn't explicitly say right in the name of the command Maple, or algorithm equals Maple as a non-default option, it's not going to use Maple or Mathematica or Magma or any of the other systems. There's maybe like three exceptions to that in the whole Sage library. So as it turns out, that's just not going to happen to you. Unless you explicitly wrote code that explicitly calls Maple, it's not going to be calling Maple. A priori, you could have imagined maybe Sage is this big system that assumes you have Maple and Mathematica and actually uses them all over the place to implement stuff. That's not how it turned out to be, though. Really, a lot of people are using Sage so they don't have to use Maple at all. It's just if you're working on Sage and you want to see the speeds of answers, or you're curious, did we get the right answer? Did we get the same answer as Maple? Right there I can check to see if the two determinants are the same. The algorithms actually are totally different between Maple and Sage in computing the determinant. I asked a Maple developer what was going on. He said it was using a Chinese remainder theorem algorithm. So it's computing the determinant [unintelligible] using the Chinese remainder theorem. Sage is using a different algorithm. And it's nice that we get the same answer. So was there another question?
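To make the cross-check concrete, here is a minimal sketch of computing an integer determinant by the Chinese remainder theorem and comparing it with an independent method. It is not Maple's or Sage's code: the function names are hypothetical, the primes are chosen by hand, and the caller is assumed to supply enough primes that their product exceeds twice the absolute value of the determinant.

```python
def det_mod_p(rows, p):
    """Determinant modulo a prime p via Gaussian elimination over GF(p)."""
    m = [[x % p for x in r] for r in rows]
    n, det = len(m), 1
    for k in range(n):
        pivot = next((i for i in range(k, n) if m[i][k]), None)
        if pivot is None:
            return 0
        if pivot != k:
            m[k], m[pivot] = m[pivot], m[k]
            det = -det
        det = det * m[k][k] % p
        inv = pow(m[k][k], -1, p)  # modular inverse (Python 3.8+)
        for i in range(k + 1, n):
            f = m[i][k] * inv % p
            for j in range(k, n):
                m[i][j] = (m[i][j] - f * m[k][j]) % p
    return det % p


def det_crt(rows, primes):
    """Combine the determinant mod each prime into one integer by CRT."""
    modulus, residue = 1, 0
    for p in primes:
        r = det_mod_p(rows, p)
        # Solve x = residue (mod modulus) and x = r (mod p) together.
        t = (r - residue) * pow(modulus, -1, p) % p
        residue += modulus * t
        modulus *= p
    # Map from [0, modulus) to the symmetric range around zero.
    return residue - modulus if residue > modulus // 2 else residue


def det_cofactor(rows):
    """Independent check: exact cofactor expansion along the first row."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j in range(len(rows)):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * rows[0][j] * det_cofactor(minor)
    return total


A = [[2, -3, 1], [5, 4, -2], [7, 1, 6]]
answers_agree = det_crt(A, [101, 103, 107]) == det_cofactor(A)
```

Two unrelated algorithms agreeing on the same input is weak evidence individually, but it is the kind of cheap sanity check the speaker describes doing across systems.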
>> For example, before, when you gave the quaternion algebra example that was written by David Kohel, that was not calling Magma? >> William Stein: No, that was completely 100% Sage code. It was written by David Kohel, and other people have worked on it too, yep. Unless it's really, really obvious that it is calling Magma, it's not calling Magma. So here's the thing. The stuff that Sage calls by default to get the work done is all included in Sage. It's -- Sage calls PARI and GAP and Singular. It calls these systems all over the place. Like if you, let's say, you want to make a multivariate polynomial over -- you want to try, say, the Fateman benchmark, which is to raise a certain multivariate polynomial to a big power. Actually, it's a little different than that. I think it's -- I'm not even sure what it is. Here's a benchmark. That's not using Magma. I think the actual benchmark is 20. So this computed the product of two multivariate polynomials. The output is 216,000 characters if you print it out. You can see it's very, very fast. Behind the scenes it used Singular; it used a C library interface to Singular that Martin Albrecht, one of the Sage developers, wrote. Singular is included in Sage. There's no requirement that you have Singular on your system anywhere. When you get the Sage distribution, it gives you everything that it uses by default. You don't have to worry about this stuff. It's here as an extra feature. Just to emphasize the usability of this feature, here's a quote from IRC last night. I need to shrink -- actually, that's a little hard to see. So this was some guy, and he wrote, I can show people at work tomorrow that they don't have to abandon Matlab if they switch to Sage. He was talking about using Sage and the interfaces to Matlab and other systems. He was wondering about the sort of conversions you can do between Sage and these other systems.
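The kind of computation being timed, a product of two large multivariate polynomials, can be sketched in pure Python with a sparse dictionary representation. The speaker was unsure of the exact benchmark, so the shape used here, p times (p + 1) with p a power of 1 + x + y + z, is an assumption, and the exponent is kept small because this naive version is far slower than the Singular-backed code described above.

```python
from collections import defaultdict


def poly_mul(f, g):
    """Multiply sparse polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for ea, ca in f.items():
        for eb, cb in g.items():
            exps = tuple(a + b for a, b in zip(ea, eb))
            out[exps] += ca * cb
    return {e: c for e, c in out.items() if c != 0}


def poly_pow(f, n):
    """Raise f to a non-negative integer power by repeated multiplication."""
    nvars = len(next(iter(f)))
    result = {(0,) * nvars: 1}
    for _ in range(n):
        result = poly_mul(result, f)
    return result


# f = 1 + x + y + z, with exponents stored in the order (x, y, z).
f = {(0, 0, 0): 1, (1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
p = poly_pow(f, 4)
q = dict(p)
q[(0, 0, 0)] += 1          # q = p + 1
product = poly_mul(p, q)   # the product p * (p + 1)
```

Every pair of terms is multiplied, so the work grows with the product of the term counts, which is why a compiled library is used for exponents like 20.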
I have no idea where this guy works, I don't know what country he lives in, but he is very excited about Sage, wants his coworkers at work to use it, and they have a lot of Matlab code. You know, instead of his sell being, hey guys, we have to switch from Matlab to Sage, it's, hey, let's use Sage in addition to Matlab, because Sage has nice cool features that Matlab doesn't have. Sure, let's continue using Matlab. We have licenses. Matlab is really a powerful, wonderful system, and we can do stuff in Sage and Matlab at the same time. Okay. Any other questions? Okay. I'm going to show you some plotting examples at this point. I'm going to show you plotting examples, number theory examples, and then my talk will be over. Here's an example of making a callable symbolic expression: sine of 3x times x plus log x plus 1 over x plus 1 squared. There you can see it. You can do lots of things with this, like you can integrate it, f dot integrate. There's the symbolic integral. You can plot it. And there's what the plot looks like. The plot command's calling conventions, like the options that you can give it, what they're named, how they work, are very similar to Mathematica. It's like thickness; everything is almost the same as Mathematica, except you use lower case instead of upper case. So it's lower case with underscores. That's just a convention in Python. In Mathematica, it's camel case. You just change all that. There's also a Matlab-style interface, if you like Matlab's plotting instead of Mathematica's plotting. And for visualizing data, Matlab's plotting syntax is actually in many ways a lot better than Mathematica's, whereas for messing around with mathematical functions, like in a calculus class, Mathematica's syntax and design is, I think, personally better. Here's an example where I plot the same function, but instead I import pylab.
That pylab thing, a Python thing, if I import pylab as P, I get this object P that has a whole bunch of functions. I can do P dot tab to see what they are. It gives you literally everything that you'll see in Matlab for 2D plotting, with the same inputs. It's like a compatibility layer, almost. So for example, the plot command, if you do that you can see how it works, and it's just like Matlab. You give the X and Y values you want to plot, and it plots them. It has these little line styles, and they're just like the Matlab line styles, and so on. So you set up a plot by just calling plot multiple times, saying you want axes at this point, saying you want a legend, deciding all these things, using all these options. This guy, John Hunter, for Python, he went through and implemented everything exactly like in Matlab for 2D plotting, and that's what you're looking at here. Maybe not exactly like it, but it's very, very similar. And here it is. So this looks just like what you'd see coming out of Matlab. It's the same plot. Here's another cool thing in Sage. You can do -- oh, I think the X is all messed up. Let me make sure. Ah, there it is, okay. So if I call fast float, then I get a version of this function such that when you call it, it's really ridiculously fast. It only takes 518 nanoseconds to call it with a given floating point input. So it's really, really fast. It's actually way faster than just defining a function in Python and calling that function. Yes. >> [unintelligible] log in 500 nanoseconds? >> William Stein: Yeah. In the interpreter. In fact, it does the entire thing in that amount of time. This guy, Robert Bradshaw, is one of my grad students at UW. He wrote something which converts an arbitrary symbolic expression into this call stack, and it's very, very tightly coded. It's a -- I mean, the actual implementation is compiled. It's written in Cython, but this is all at run time.
This is done at run time, and it gives you back something that you can then call. And this was really important because we were doing things like contour plots of functions, 3D plots, like 3D, you know, give a function. You want a plot of a function, you have to evaluate it on a mesh. It was ridiculously slow. Literally, every time it would evaluate one of these functions it would call out to Maxima. Maxima would plug numbers in, simplify the result, and we would parse that. You'd try to plot a function, and it would spend 30 seconds trying to evaluate the function on a big mesh. You'd put in all the defaults so it would only evaluate 15 points in each direction or something ridiculous. Robert thought this was completely stupid, so he spent a weekend and he wrote this fast float thing. It works not just on single variable functions. You can make up a function of many variables and it works just as well. X cubed times sine of X squared plus cosine of X times Y minus 1 over X plus Y plus Z. So it's three variables, and now you can make a fast float thing on those three variables, and then call it. And let's see how fast it does that. So 715 nanoseconds to evaluate that expression on these three input variables. So it's very, very fast. You remember the other one was 500 nanoseconds. And this is pure Python. So if you do the same thing in Python, exactly the same expression as I had before, it takes over ten times longer. So we're really beating what you get from the Python interpreter. So yeah, he just wrote this, and it's very, very nice. There's an engineer at Newton Labs, Carl Witty -- it's a place that does computer vision in Renton. And he's been working on making a very sophisticated version of this fast float that doesn't require the input to be a float. That's going to be pretty cool. So it will be something like this, but for finite fields and all kinds of other things.
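The idea of converting an expression into a flat stack program once and then running that program for each input can be sketched as follows. This is an assumption-laden illustration, not the Cython implementation described in the talk: the nested-tuple expression format and the names `compile_postfix` and `make_evaluator` are invented here, and in pure Python the result will not be fast. It only shows the structure.

```python
import math

UNARY = {"sin": math.sin, "cos": math.cos}
BINARY = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
    "pow": lambda a, b: a ** b,
}


def compile_postfix(tree):
    """Flatten a nested-tuple expression tree into a postfix program."""
    if isinstance(tree, (int, float)):
        return [("const", float(tree))]
    if isinstance(tree, str):
        return [("var", tree)]
    op, *args = tree
    program = []
    for arg in args:
        program.extend(compile_postfix(arg))
    program.append((op, None))
    return program


def make_evaluator(tree, variables):
    """Return a function of `variables` that runs the compiled program."""
    index = {name: i for i, name in enumerate(variables)}
    program = [
        ("load", index[arg]) if op == "var" else (op, arg)
        for op, arg in compile_postfix(tree)
    ]

    def evaluate(*values):
        stack = []
        for op, arg in program:
            if op == "const":
                stack.append(arg)
            elif op == "load":
                stack.append(values[arg])
            elif op in UNARY:
                stack.append(UNARY[op](stack.pop()))
            else:
                b = stack.pop()
                a = stack.pop()
                stack.append(BINARY[op](a, b))
        return stack.pop()

    return evaluate


# x^3 * sin(x^2) + cos(x*y) - 1/(x + y + z), the three-variable example.
expr = ("sub",
        ("add",
         ("mul", ("pow", "x", 3), ("sin", ("pow", "x", 2))),
         ("cos", ("mul", "x", "y"))),
        ("div", 1, ("add", ("add", "x", "y"), "z")))
f = make_evaluator(expr, ("x", "y", "z"))
```

The design point is that all tree walking and name lookup happens once in `make_evaluator`; each call afterwards is a single pass over a flat instruction list, which is the part a compiled implementation can make very tight.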
You can imagine this will be really useful for maybe enumerating points on varieties over finite fields, that sort of thing. So that's one of those cool little gems that are in Sage. I think I'll just skip this. This is just doing integration in Sage and Maple and comparing the answers. Here's a cool example. So this shows how you can use pylab, just like you'd use -- or Matlab -- for loading images and manipulating them. If you do imread, you can read in an image. Here's a picture of Seattle. What it does, this has red, green and blue channels. It gives you a three-dimensional array. Basically it's like three arrays, one for red, one for blue, one for green. It's a single object A that's a three-dimensional array. I'll show you just a little bit of it. So you can do, like, this may print out a lot of stuff. Yeah. So that's the upper left pixel. It has these red, green and blue values. And here, the second thing I did was I took that image and I took just the blue channel, and that's what the thing on the right is. So you can plot like this. You can also have fun. You can, for example, if you take one minus the matrix, then plot it, and it inverts it. So you can do image manipulation mathematically like that. You can also -- I don't know what this means, because I'm not quite sure what inverting a -- you know, a three-dimensional array means, but I've inverted the picture. I actually don't know what that means. I don't know if it inverts each of the three individual matrices that give the red, green and blue channels. I just don't know. I probably should figure that out. But it looks amusing. Okay. Here's an example of something called interact. You can, just by -- if you put that little thing, that decorator, as it's called, @interact, before any function definition, then the function becomes interactive, and you get interactive control over the inputs to the function.
Notice that this function has two inputs: an integer I, which is going to be the number of eigenvalues used to compress an image, and a boolean flag, whether to display the axes on the outside. And those inputs get converted into controls. There's a slider for I and a check box for whether or not to display axes. If I click the check box, it will redraw the -- rerun the function, but with the axes not displayed. If I drag the slider, it will change how many eigenvalues are used for compression. There it uses seven. Hence the image is really washed out. Whereas if I use, say, I don't know, 38, then the compressed image looks reasonably good, though, of course, the original is better. If I use more, it's almost indistinguishable. But it's nice to be able to play around with this in your web browser. Like this. This is very, very similar to Mathematica's Manipulate command, except theirs doesn't work in a web browser and is a little bit more powerful in some ways. This is quite powerful. We have a lot of different controls. Like if you wanted to choose a color for some reason, you can -- it will give you a color selector. So when you click there, then that variable gets set equal to the color you click on, et cetera. So there are lots of different things like that. And there are little matrices. If you wanted to have a matrix, you could do M equals, say, a matrix, maybe integers, and make it a 2 by 2 matrix, and there it is. And you could fill it in. And when you change the values, then the function gets rerun, but with M set to those values. So it's pretty cool. It's a really, really useful idea. I would say it's a great idea in Mathematica, but it's not actually -- Mathematica just copied it from what had been done by other people. You can already see this in, like, some old -- there are certainly some old Unix tools for making GUIs that do similar things.
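How a decorator can turn a function's inputs into controls can be sketched in plain Python. This is a toy model, not Sage's `@interact`: the mapping from default values to control kinds, the `controls` and `update` attributes, and the `compress` function are all hypothetical, and there is no browser involved. A simulated "user moved a control" call stands in for the real widgets.

```python
import inspect


def interact(func):
    """Attach a description of UI controls inferred from default values.

    Hypothetical mapping, for illustration only:
      bool default        -> checkbox
      (lo, hi) tuple      -> slider starting at lo
    """
    controls = {}
    for name, param in inspect.signature(func).parameters.items():
        default = param.default
        if isinstance(default, bool):
            controls[name] = {"kind": "checkbox", "value": default}
        elif isinstance(default, tuple) and len(default) == 2:
            lo, hi = default
            controls[name] = {"kind": "slider", "min": lo, "max": hi, "value": lo}
        else:
            raise TypeError(f"no control known for parameter {name!r}")

    def update(**changes):
        """Simulate the user moving a control, then rerun the function."""
        for name, value in changes.items():
            controls[name]["value"] = value
        return func(**{n: c["value"] for n, c in controls.items()})

    func.controls = controls
    func.update = update
    return func


@interact
def compress(i=(1, 50), display_axes=True):
    return f"using {i} values, axes {'on' if display_axes else 'off'}"
```

For example, `compress.update(i=7)` reruns the function with the slider at 7 and the check box unchanged, which mirrors the rerun-on-change behaviour shown in the demo.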
And in the Python world, Enthought's core technology, sort of, is something called the Enthought Traits library, and it does essentially this, but it's just more complicated to use. So it's a pretty cool idea, though, to just make functions interactive like that. And now, here's some 3D plots. So if we make up a function, like that, say sine X minus cosine -- sine of XY minus cosine of X. Let's make it more complicated. Make X to the power of five, and what we get is a 3D plot, and you can zoom into it. You can zoom out. You can rotate it around. And so on. The way it works is that there's a Java applet that gets embedded in the web page that does 3D plotting. But it only uses 2D Java primitives, so it doesn't require that I sign any code or run anything that uses 3D acceleration. It runs on a lot of different systems. Here's another example that just illustrates how plotting works. There are lots of different primitives, like spheres, icosahedrons, tetrahedrons, et cetera. You can just add them together and you get a scene. And you can give properties. Like I'll make an orange icosahedron, I'll add it to a red sphere that's transparent, et cetera. If I evaluate that, here's what I get. I get the scene that has those objects in it. So I have my icosahedron, I have my sphere. One of the spheres is transparent, and so on. And there are also some cool things; you can set things spinning, like this. So if you want to leave a demo running while you're teaching or something. You can, if you want to give your audience a headache, you can do cross-eyed view. And then everybody -- because I have zoomed in so much so that the fonts look good, it's kind of clipping it partly. But if I -- yeah, let me -- it gets confused if you zoom in a massive amount. That's causing trouble. All right. Yeah, I'm confusing it by zooming so much, I think, which is unfortunate. Jeez, that does not look happy. It's really clipped. Okay. So I want to finish up, so I better not risk too much time on this.
I'm just going to move on. Here's another example of -- so here's an example -- oh, great. I have seriously messed this up. Let me just refresh. So here's an example of -- oh, weird. That is really weird. Hm. Somehow, by zooming in and out a bunch, I've confused the Java applet big time. Yeah, that's not good. All right, well, not so good. Here's another example. So this is just an indexed face set. So it's just a bunch of 50,000 triangles. Yes? >> This is the second or third time you have random stuff. Where do they come from, and how bad is it, because there are so many bad ones out there. >> William Stein: Ah, good question. There is a -- this guy Carl Witty I mentioned earlier, the engineer that works at Newton Labs, he really likes random numbers. He came up with a nice unified framework for the random numbers. There's a single file where the random seeds are set for the various subsystems. You can do set random seed, and the GAP random number generators, all the other random number generators, in PARI, all get set. There's one point where you can set everything. There are really good random number generators in the GNU Multi-Precision library. There are good ones in Python, in some of our libraries. The answer is there's probably at least 20 different random number generators. There's also NumPy, which is one of the libraries included in Sage, which has about 100 or so different distributions that it implements. So that gives you lots of different, differently distributed random numbers. So the answer is, there are some good random numbers. There are some bad random numbers that are documented, and you -- and you have to look. So the answer, I guess, to your question is it's all over the place. There's a lot of stuff. But at least there is a good framework for setting the seed for the random number generator and figuring out what's getting used. I'll go through this example very quickly. I'm running out of time.
Here's an example of an e-mail that appeared on sage-support this Saturday. It's a real world report of a number theory calculation. I haven't thought about what he's doing. But he's iterating over lots of values and computing the number of numbers with some property. It's the sort of thing that you could -- it doesn't use too much, just iterating over ranges of numbers and checking things like if something is a square and so on. The way he'd written it, it was creating a list of a billion entries in memory all at once, which on his puny computer would run out of RAM. So there's a way to just make a one-letter change to the function or, well, actually, change the notation slightly. And instead of creating the entire list at once, it makes a lazy list. And that makes it so his code would run. So I told him, on the sage-support mailing list, this will work instead, and then I started it running. On his example, he wanted to do it up to 10 to the 9th. I kind of got bored because it was sitting there for a long time. I decided to change it to Cython code. I put %cython at the top. I import some C library function and I declare data types. And I said everywhere T is going to be long long. Instead of using Sage's is_square, I'm going to use the C library square root, and Y is going to be long long. So I declare some data types. Then I hit shift-enter, which I'll do right here. And then I tried this function on a couple of inputs, and for example, for 10 to the 6th, it turns out that it's 238 times faster than the uncompiled version. So this is real world code. This is an actual user on Saturday. And we got that big of a speed-up by using Cython, by declaring some data types and compiling it. >> In the background, is it actually going back and compiling [unintelligible], then? >> William Stein: Yep, here's what happens. It takes -- if you click here, it shows you the code I just compiled. But it turns various parts of the code into C.
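The first fix mentioned above, replacing a list built all at once with lazy iteration, can be sketched in plain Python. The user's actual code is not shown in the talk, so the property tested here (whether 8t + 1 is a perfect square) is a made-up stand-in, and this sketch covers only the memory issue, not the Cython speed-up.

```python
from math import isqrt


def is_square(n):
    """True when n is a perfect square (exact integer arithmetic)."""
    r = isqrt(n)
    return r * r == n


def count_eager(limit):
    """Builds the full list of candidates in memory before counting."""
    candidates = [t for t in range(1, limit)]
    return len([t for t in candidates if is_square(8 * t + 1)])


def count_lazy(limit):
    """Same count, but values are produced one at a time and discarded."""
    return sum(1 for t in range(1, limit) if is_square(8 * t + 1))
```

Both functions return the same number, but `count_eager` needs memory proportional to `limit` while `count_lazy` needs a constant amount, which is the difference between running out of RAM at 10 to the 9th and merely being slow.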
For example, that code right here gets turned into this C code. And this if check gets turned into that C code. In fact, for things that just involve C variables and C operations, it's almost a one-to-one translation. It literally gets identically translated to C, except the variables get a funny prefix to obfuscate them. That runs at exactly the same speed as C. It generates a C file. Some things that are more complicated, like appending to a Python list, that is a couple of lines of Python API code. It generates this big file that can be compiled using C. It then compiles it using a C compiler to make a shared object library, and then it links it at run time. So that's what happens behind the scenes. That all happens when you hit shift-enter. There's also a Fortran -- you can place Fortran into this code also. And it will do something similar, but with Fortran code. There are a couple of other things like that. For ten to the ninth, his example took 25 seconds. I was able to rewrite it in Cython and run it in a lot less time than it would have taken to just wait for the Python version to finish. And it's not contrived; this happened. It's a real world example, okay? Cython, by the way, is a standalone project that you can get if you use Python in any way at all. You can use Cython. It's a completely separate thing from Sage. A lot of the development of Cython goes on at the University of Washington also. It's still a separate project. For the end, I'm going to show you examples of number theory. Here's making an elliptic curve and plotting it over a finite field, which, of course, is just random looking. It's fun to do that, because you're not supposed to. So there you are. Here's an example of computing the [unintelligible] group structure of the elliptic curve over a finite field. Given any elliptic curve over a finite field, Sage will do that. It's using baby step, giant step, basically. John Cremona implemented that.
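For readers who have not seen point counting on an elliptic curve over a finite field, here is a brute-force sketch for small primes. It deliberately does not use baby-step giant-step or any of the algorithms mentioned in the talk; the function names are hypothetical, and the approach only works when p is small enough to loop over every residue.

```python
def count_points(a, b, p):
    """Number of points on y^2 = x^3 + a*x + b over GF(p), p an odd prime.

    Brute force: tabulate how many y give each square, then sum over x.
    The extra 1 is the point at infinity.
    """
    if (4 * a**3 + 27 * b**2) % p == 0:
        raise ValueError("singular curve: discriminant is zero mod p")
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))


def satisfies_hasse(n, p):
    """Hasse's bound: |p + 1 - n| <= 2*sqrt(p), checked with integers."""
    t = p + 1 - n
    return t * t <= 4 * p
```

For instance, `count_points(1, 1, 5)` counts the points on y^2 = x^3 + x + 1 over the field with 5 elements, and `satisfies_hasse` confirms the count lies in the window that faster algorithms search.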
Here's an example of making an elliptic curve over a 60 digit prime field and computing -- displaying it and computing the number of points on it, which takes about five seconds, and this uses PARI, a third-party add-on to PARI called sea.gp. Here's an example of computing a basis for the space of weight three modular forms of level 12. See right there. So there's a bunch of, all kinds of different number theory related to elliptic curves and so on. What we need to improve is hyperelliptic curves. Hyperelliptic curves in Sage have very little functionality. There's a very little bit. We do have one thing. We can compute the matrix of [unintelligible] acting on the elliptic curve very efficiently and better than a lot of other -- well, anything else. But a lot of basic operations with hyperelliptic curves are not in Sage, which needs to be there for crypto applications. Okay. So that ends my talk. And I'm certainly out of time. Are there any quick questions that you haven't already asked? >> Numerical libraries. >> William Stein: Ah, yes. So Sage is very good at that. There are two libraries called NumPy and SciPy, which are mainly funded by and organized by this company, Enthought, in Texas. Let me just show you their website very quickly. And they're included in Sage. So their goal is to provide an alternative to Matlab, based on Python. They're commercial, but everything they do is -- everything they release is BSD licensed. And we include their code in Sage. That gives us quite a lot of numerical computing functionality. So I mean, you know, they do NumPy and they do a lot of projects, and the stuff they do is very, very good. Just to show you, just a quick example, you can do, well, you can do things that look like you're using Sage. But behind the scenes, it will use, like, that makes a random matrix with -- that makes a random 500 by -- that's not so useful. That's a random 500 x 500 matrix with double precision entries.
Let's say I want to do something like compute all the eigenvalues. Hopefully this works. So that takes under a second. And behind the scenes, this is using NumPy, which has stuff that's built on top of LAPACK and BLAS and so on. Sage incorporates ATLAS, so anytime you get Sage, it has an optimized build of ATLAS in it. That gives us numerical linear algebra. There's a ton of numerical linear algebra. If you do import -- let's just say there's a lot. There's also a lot of numerical optimization, et cetera. So that's all part of the SciPy library import. For example, import scipy.special, that will show you a bunch of numerical special functions, much of which wraps Fortran libraries, the same Fortran libraries that Matlab is wrapping. So you can see there's a ridiculous -- I mean, I'm going on for pages and pages and pages with special functions that are in SciPy. Or let's do scipy.optimize. Import scipy.optimize, and then scipy.optimize -- oops. And then you see that there are a lot of implementations of things. And they're not just some cheesy implementation, like fsolve. It's a pretty sort of big thing, as you can see. There's a ton of different options. And I think this is, presumably this is wrapping some existing Fortran library. I think the government sponsored the development of a lot of public domain Fortran code over the years, and that's incorporated into Sage via SciPy. Yes? >> Like what about debugging? Debugging scripts. >> William Stein: There's an interactive debugger that you can only use from the Sage command line. So if I start up Sage over there, then there's -- let's see. There's this project, IPython, that makes a very nice interactive command line, which Sage uses. And it has integrated into it a nice interactive debugger.
So if I, I don't know, if you type %pdb and then do something that leads to an error, it will give you a traceback, and it will dump you to the point where the error occurred, and then you can start changing variables. So maybe I'll try to make a singular elliptic curve. That will obviously cause some sort of error to occur. And it dumps me to this point of code, line 147 of ell_generic, where it said the invariants define a singular curve. I can actually print them. It's now like I'm running in that scope. I can type L and see the code around it. I can see that it computed self.discriminant. I can call that if I want. I could literally, I think, change the variables. I'm not sure if this will work. And now the discriminant -- oh, because it's cached, probably. But you can literally inspect variables locally and step through and see what happens. It's a pretty nice debugger for an interpreter. And this is often very useful. There's also a profiler; you can do %prun and see, you know, maybe you're annoyed that creating elliptic curves seems to take too long. So it will tell you what functions get called, what takes the most time, how many times each function is called, et cetera. So there's that. There are some IDEs for developing Python code, like there's Wing IDE, which is a commercial Windows IDE. There's, of course, Eclipse. So here's Wing Python IDE. And since, you know, Sage really in a sense is just a Python library, any tools out there for working with Python, you can use with Sage. So it should have given me a screen shot, but you can kind of see there's some IDE here. There are a number of different IDEs. Those also have debugging tools in them. But a lot of Sage developers, maybe we're old school, but we basically use this command line and print statements and stuff, you know.
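The kind of report a profiler produces (which functions were called, how often, and where the time went) can be reproduced outside the interactive shell with the standard library's `cProfile` and `pstats` modules. This is a minimal sketch: the two profiled functions are made-up stand-ins, not Sage code.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately naive helper so it shows up in the profile."""
    return sum(i * i for i in range(n))


def build_table(rows):
    """Calls the helper once per row."""
    return [slow_square_sum(1000) for _ in range(rows)]


# Profile a single call and capture the report as text.
profiler = cProfile.Profile()
table = profiler.runcall(build_table, 50)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
```

Printing `report` shows one row per function with call counts and cumulative time, so a helper called many times from one slow entry point stands out immediately.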
But there's definitely -- well, I saw a talk on the weekend by a guy from Sun, and he said they're putting a lot of effort into improving the tools for doing basically IDEs for Python development, because they think that will make Python a more desirable platform for a lot of developers, because right now, it's good, but it could definitely be better. You don't have something like Visual Studio or something. Any other questions? Because I'm definitely over time. I don't want you to lynch me. >> Kristin Lauter: So let's thank William, and we're going for lunch in case anyone wants to continue this discussion. [applause]