>> Vic: So it's my pleasure to welcome Shivnath Babu who is an assistant professor at
Duke University. Shivnath got his Ph.D. from Stanford University in 2005. And since
then he's been at Duke. He's already been the recipient of three IBM faculty awards and
an NSF CAREER award. He also, just in this last SIGMOD, his work on the iTuned system,
which he'll talk a little bit about today, won the best demo award. And so with that,
Shivnath, we'd like to hear from you today on experiment-driven system management.
>> Shivnath Babu: Thanks, Vic. So experiment-driven system management is a sort of
technology that we have been working on for quite a while now, around five years. And let
me first introduce some of the types of problems that we are trying to target with it.
So one very common problem that sort of shows up in the database management context is the
problem of tuning problem queries. So imagine you have a complex query, maybe from a
[indiscernible] generation workload, that might have some requirements on it, and if the
query's actually not running fine, right, there might be a user or a DBA who might have to
tune it and improve performance by some factor. It's commonly called a SQL tuning task. So
what ends up happening in this context, when you have a specific query that you have to
tune, is that what a DBA might do, the usual course of action, is to
actually collect some monitoring data. And in the last [indiscernible], the monitoring data
might include even looking at the plan, you know, the different operators, the number of
records the different operators are returning. We actually do a lot of work with
PostgreSQL. And if you're familiar with PostgreSQL, you know that if you do this EXPLAIN
ANALYZE for a query, it actually shows information which can be [indiscernible] like this --
it shows you the plan, the operators. It shows you the estimated cardinality, that is, the
number of records the optimizer thought that operator might return, and the actual
cardinality.
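As a concrete illustration, here is a minimal sketch, not part of the talk's tools, of pulling those estimated and actual row counts out of PostgreSQL's EXPLAIN ANALYZE output with psycopg2; the connection string and the query text are placeholders.

import re
import psycopg2

conn = psycopg2.connect("dbname=tpch")   # placeholder connection string
cur = conn.cursor()
cur.execute("EXPLAIN ANALYZE "
            "SELECT * FROM orders o JOIN lineitem l "
            "ON o.o_orderkey = l.l_orderkey WHERE o.o_totalprice > 100000")

# Each plan line looks like: "... (cost=... rows=EST ...) (actual time=... rows=ACT ...)"
row_pat = re.compile(r"rows=(\d+).*actual time=[\d.]+\.\.[\d.]+ rows=(\d+)")
for (line,) in cur.fetchall():
    m = row_pat.search(line)
    if m:
        est, actual = int(m.group(1)), int(m.group(2))
        # A large gap between est and actual marks that operator as a tuning suspect.
        print(f"{line.strip()[:48]:48s} estimated={est:>8d} actual={actual:>8d}")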
After looking at something like this, you might realize, oh, look, this is showing that
only [indiscernible] records are going to come when actually 700 records are going to
come, and there might be a hypothesis that [indiscernible] replacing that index nested-loop
join with a hash join might actually improve performance.
Or sometimes it might be that looking at these things and actually changing the
[indiscernible] improves performance. You might give [indiscernible] to actually
[indiscernible] and whatnot.
But the main point here is that to actually achieve such a task, to really tune the
[indiscernible] from problem identification to actually getting a fix and putting it in the
system, a lot of runs might happen -- runs of plans like this to observe, to collect more
information, to actually sort of validate some of your hypotheses, and finally, once you
have found a fix, to validate that fix, that your plan actually works, before you can put
it in the [indiscernible] system.
Okay.
Another domain: parameters. Many database systems, actually most database systems, have
these tuning parameters. Sometimes they are called configuration parameters, sometimes they
are called [indiscernible] parameters. There are these parameters that control buffer pool
sizes, I/O optimization, the block sizes, the rate at which to checkpoint, parallelism, the
number of concurrent queries you can run, lots and lots of parameters to tweak the
optimizer's cost model.
With a system like PostgreSQL, again, we have done a lot of work with this. We have found
that around 15 to 25 parameters, depending on whether it's a read-only, OLAP setting versus
an OLTP setting, can actually have a significant impact on performance. And I think for
most database systems, not just PostgreSQL, it can be a very frustrating experience to tune
these parameters [indiscernible] performance, and there are no known good holistic
parameter tuning tools available, and for good reason.
So here actually I'm trying to summarize a problem that vexed us for a long time. This
actually got us into this whole parameter tuning game. So what I'm showing here is a
response surface, so there are two axes, two different axes. This parameter here, whose
value is in megabytes, is the main buffer pool setting in PostgreSQL, shared buffers. And
this one is actually -- it's an advisory parameter where you're telling the PostgreSQL
system, look, there is such and such amount of operating system buffer cache. It doesn't --
you can lie, really. Nobody is really going to check that.
So we actually varied these two parameters. In fact, we varied many more. And what
[indiscernible] here is the performance for a workload [indiscernible] just once in the
query [indiscernible]. [Indiscernible] part of the [indiscernible] database setting, if you
can actually call it that. [Indiscernible] TPC-H database, size one gig, with a one-to-four
ratio, and in fact, this whole database is running inside a Xen [indiscernible].
So what sort of got us in here is, we had been running the system with the sort of
parameter setting that was advised by the community, which [indiscernible] set your shared
buffers to around one-fourth of the total memory size and then, giving some memory to the
operating system, set this value to the remaining. So that kind of gets you right to a
point over here where performance is really bad. And this actually was I/O -- the system
was I/O bound. And we could see that. Okay.
So the natural thing you would think of is to increase the buffer pool size, and that will
improve performance. We did that, and performance became worse. So it's like we're not
seeing the surface. We are just tweaking things and then observing performance.
That got us to generate the surface for this same task. And you can see, in one of the
surfaces generated, there are some regions where performance is good, some regions where
it's bad. You can see the trends.
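As an illustration of how such a response surface can be generated, here is a rough sketch. The run_workload helper is a stand-in for the real experiment harness (which would restart PostgreSQL with the given shared_buffers and effective_cache_size and time the workload); the formula inside it is synthetic, only so the sketch runs end to end.

import itertools

def run_workload(shared_buffers_mb, effective_cache_size_mb):
    # Stand-in for a real experiment: restart PostgreSQL with these settings,
    # run the workload once, and return the elapsed time in seconds.
    # The formula below is made up so the sketch is executable.
    return 1000.0 / (0.5 * shared_buffers_mb + effective_cache_size_mb)

surface = {}
for sb, ecs in itertools.product([64, 128, 256, 512, 1024],
                                 [128, 256, 512, 1024, 2048]):
    surface[(sb, ecs)] = run_workload(sb, ecs)

best = min(surface, key=surface.get)
print("best (shared_buffers MB, effective_cache_size MB):", best)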
I'm not going into details of everything that happened, but what I want to tell you is, the
moment you see this, you know that I don't have to know what these parameters are. I can
set them to a value and get good performance. But how do real users really [indiscernible]
deal with problems like this? There's no model.
Actually, I just want to make one point. When you're changing parameters, a lot of things
can change. The query [indiscernible] a different plan. Even within the plan, the operator
implementation might actually change. The impact of buffer pool settings can actually
change. A lot of things can happen. [Indiscernible] into how we actually diagnosed the
cause of this interesting behavior.
So how do real users deal with this thing? They do exactly the sort of things that we were
trying to do. We were [indiscernible] a parameter. And [indiscernible] one parameter at a
time and try to tweak the system to good performance. And this can actually, if you think
about it, this can be horrible if there are interactions between parameters, where the
impact of changing one parameter -- the impact of changing, let's say, effective cache
size -- is different for different settings of shared buffers. Right? This
one-parameter-at-a-time approach can actually lead you to really poor performance.
So these are two examples of like how management -- system management or database
management -- happens in the real world, where this is the sort of [indiscernible]
happening.
I realize, based on the monitoring information that I have collected about a system, that
performance is bad. I have to tune it. Or any general management task -- I'm going to give
you some interesting examples. And then you realize you have to actually get more data.
And I plan some experiments that I have to do. In the context of parameter tuning, it will
be running the workload for a specific database configuration. Right?
I plan the experiments. I conduct them. And that brings in information. There are different
types of information you can collect. Process it, and maybe, at the end, you have enough
information to achieve your task. Sometimes you don't have it. And you have
[indiscernible]. This process is very, very common, and the sad news is it is very
[indiscernible], very ad hoc. Right? Even things like [indiscernible] done these
experiments -- do I run them on the production system, or do I actually run them on my test
system? What if that is a different setting? So there are a lot of things which
[indiscernible] really struggle with, but [indiscernible] work is to sort of automate this
process to whatever level it can be automated.
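A schematic sketch of that loop, with toy stand-ins for every helper; none of these names come from the actual tools, which plug in their own planners and harnesses.

def plan_experiments(tried):
    # Decide which configurations to try next, given what has been tried so far.
    candidates = [{"shared_buffers": 128}, {"shared_buffers": 256},
                  {"shared_buffers": 512}, {"shared_buffers": 1024}]
    return [c for c in candidates if tuple(c.items()) not in tried]

def run_experiment(config):
    # Stand-in: run the workload under `config` on a workbench (not production)
    # and return a performance number; synthetic here so the sketch executes.
    return 1000.0 / config["shared_buffers"]

def tune(goal_runtime_s):
    knowledge = {}
    while True:
        todo = plan_experiments(knowledge)          # plan the experiments
        if not todo:
            break                                   # nothing left to learn
        for config in todo:
            knowledge[tuple(config.items())] = run_experiment(config)  # conduct them
        if min(knowledge.values()) <= goal_runtime_s:
            break                                   # enough information for the task
    return min(knowledge, key=knowledge.get)

print(tune(goal_runtime_s=2.0))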
So I'm going to give you some examples of the different sort of domains where we have put
this to use and the tools we have built. I'm going to be spending most of my time today
talking about that first one, SQL tuning. This is the problem I introduced at the
beginning, and a couple of tools that we have built there.
The second one is the one that I usually used to talk about; in fact, I was at Stanford
three months back and I talked about this iTuned tool. I'm not going to bore you with it
and talk about the same thing again. So there is one [indiscernible] that we built that we
are actually trying, the one that Vic actually mentioned. We actually presented a demo on
this iTuned system at SIGMOD. And we are doing some more work on that.
Down south, the same sort of tuning problems actually appear in the Hadoop MapReduce
setting. So this is one simple, you know, [indiscernible] surface and [indiscernible] to
it, but Hadoop today, each job today has around 190 configuration parameters, and just like
in a database system, around 10 to 15 of them can be really crucial for performance, and
there are interactions and whatnot.
So that's something that we're actually studying and building a tool on.
One of the problems that we attacked at the very beginning, and a very common problem that
we see, is this whole notion of benchmarking and capacity planning. So this is actually for
a storage server, where -- it's running the NFS file system -- we are varying the number of
NFS [indiscernible] and the actual number of disks in the storage system. You can see that
some performance aspect is being measured, and against this dimension it has a
[indiscernible], against this dimension it doesn't. So in benchmarking, often people are
interested in capturing either some map of how performance will be, or which parameters are
important. Right?
You don't want to spend a lot of time doing this. In fact, [indiscernible] to report TPC-H
performance numbers, you might spend a lot of time doing something like that.
And what we realized is that you don't really need this exact surface. It can take days or
even months to actually generate this. A quick approximate surface can often be very
useful, or can find the important parameters. And we did some work which we call cutting
corners. Right? The idea is [indiscernible] generating the surface, but you can cut corners
and get an approximation [indiscernible]. That's often good enough. And it takes one-tenth
or an even smaller fraction of the time.
Then another problem that we have attacked is this thing which I'm calling
interaction-aware scheduling. So we all know that query [indiscernible] generate plans
looking at one query at a time. The moment multiple query plans are running together, who
knows what's going to happen? Maybe they actually help each other out. Maybe they beat up
on the same resources and performance can be poor. A scheduler that is aware of this
interaction can do a really good job. But how does it know about the different interactions
that can arise when queries are running in a mix?
So for that, you need models. Who's going to give you these models? The only way you can
actually do that is to run, for your system and the mixes it is taking, some of these
mixes, observe, and build the models; no model will work in any old context. Nothing will
[indiscernible]. Right?
So we have attacked that again using the running of mixes, collecting information,
selecting which mixes to run, because the space [indiscernible] is huge and experiment
management was really critical there.
One problem we are actually focusing on now is that data in a database system or in a file
system can easily get corrupted today, because people run things on [indiscernible]
hardware, where the hardware can actually be very flakey. You know, it can have
[indiscernible] bugs, network interface [indiscernible] bugs. Administrators can also make
mistakes. So for a lot of reasons, data can get corrupted, and we have this very
interesting example -- I'm not going to tell you where it is from.
But there was a typical application which -- it actually crashed and it couldn't be brought
up for a long time, literally a long time, because what had happened was that corruption
happened. Somewhere the data for that application had become corrupted. Backups and --
this [indiscernible] everything was in place. But the backups had also become corrupted
because nobody was really checking. Right? And the application crashed. We tried to bring
it up from the backup. The backup was also corrupt. So literally, a lot of data was
actually lost because of things like this. And again, how do we attack something like this?
There are data [indiscernible], both built-in checksums and there might be logs, right --
things that you can actually run to detect it and give advance warning to administrators.
So that's again an interesting domain we're actively working on, where this whole
[indiscernible] is used, because you're running tests, you're getting some information from
the test results, and then figuring out, maybe if this snapshot is wrong, then
[indiscernible] maybe the database could also be corrupted -- actually bringing in some
interesting information.
So after building a whole bunch of these things, what we realized is there's actually a lot
of [indiscernible]. All of these are [indiscernible] tools. There's a lot of commonality,
for instance, at the level of maybe scheduling experiments, maybe the whole harness for
running experiments. A lot of commonality in the planning algorithms.
So we are trying to raise the level of [indiscernible] up to the level of a language. And I
don't know how much time I will have, but I'll try to at least give you some of the main
ideas. And the way I usually do these talks is, after having given the overall vision,
right, I pick one tool that we have built and then use that as an end-to-end thing to
illustrate the different aspects of planning, running the harness, and everything.
And what I've chosen to do today is this Xplus SQL-tuning-aware optimizer and at the
end I'll actually talk more about, depending on the time I have, some of our current focus.
So this is the same problem I introduced in the beginning. You have a problem query that
you'd actually like to tune, or the admin does. He needs to improve the performance by some
factor. And what happens today, actually, is a lot of this happens manually; they do things
by hand.
Wouldn't it be great if the administrator could instead pose the question back to the
database system: Look, I have this query I have to tune. Okay. And I'm not happy with the
current plan. Right? I need to improve its performance. Can you generate a better plan for
me?
So this [indiscernible] is actually one small part of this overall SQL tuning. Sometimes by
rewriting the query, sometimes by adding more indexes, or sometimes by tuning these
parameters, you can improve performance. You can actually fix this problem, but all of
those are invasive in different ways. Adding an index can actually affect the performance
of other queries. It can affect workloads, right, the update ratios and things like that.
So we focused on this smaller piece of the problem, and what we were able to show is, if
you cannot find such a plan, with some caveats, it sort of means that there's no other
option than some invasive tuning. And what we have built is a real optimizer. But where the
regular optimizer's goal is, given a [indiscernible], generate a good plan, right,
[indiscernible] all of that, this one also supports what we call tuning sessions.
So in a tuning session with our system, the inputs that are given are, of course, the query
whose performance is not good. If you have the plan, the current plan whose performance is
not good, you can give that as well. We actually assume that is available as of the
current prototype. And then there are these tuning objectives. We support different tuning
objectives, like improving performance by some factor, or within one hour -- I need a
better plan in an hour, or find the best plan you can find in an hour. So we support
multiple tuning objectives.
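Purely as an illustration of that interface, here is a hypothetical way the inputs to a tuning session could be packaged; the class and field names are invented and are not the actual Xplus API.

from dataclasses import dataclass
from typing import Optional

@dataclass
class TuningSession:
    query: str                            # the problem query (required)
    current_plan: Optional[str] = None    # its current plan, if the DBA has it
    objective: str = "improve_by_factor"  # or "best_plan_within_budget"
    speedup_factor: float = 2.0           # "improve performance by this factor"
    time_budget_s: int = 3600             # "find the best plan you can in an hour"

session = TuningSession(query="SELECT ...",   # placeholder query text
                        objective="improve_by_factor",
                        speedup_factor=5.0)
print(session)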
And the way Xplus actually works is by carefully choosing some plans or some subplans of
this query to run. It runs them, collects information, and iterates. It's basically
applying that same old experiment-driven [indiscernible]. Okay.
And because it's running [indiscernible], we always have -- our tools always have this
third input, which is constraints, because tuning is going to impose some overhead
somewhere. Right? The admin or the user would like to specify under what constraints that
tuning might actually be done.
I'll give you an example. So one really simple example is if you have a database system
that has some multiprogramming level of ten. Now, you can use one of those
[indiscernible]s to run the tuning subplans. That is just one very, very simple example.
Somehow you want to limit the overhead of tuning. This type of tuning is not free. It's
going to pay some overhead somewhere. Maybe it's in dollar cost, the cost of a cloud. Maybe
that [indiscernible] is possible in a system: look, I have $10 to spend an hour. You can
get resources from the cloud, on EC2, and do [indiscernible]. Right?
So there are different types of interesting constraints that the [indiscernible] harness
supports that we have not presented.
So just to give a quick example so that we are all on the same page, assume this is a
subplan that the system chooses to run for the [indiscernible] query: a hash join on two
tables. So even before the subplan is run, we have the estimated cardinalities as estimated
by the optimizer. Remember, of course, where these come from.
At the end of the run, we actually collect two pieces of information. One is the actual
cardinality, how many records were produced [indiscernible] the plan, and then something
which we call the estimated-actual cardinality. So the 670 is -- if the query optimizer
knew that the tables R and S are going to produce 1,050 and 850 records, respectively, this
would be its estimate, not this.
So if you see this, the 670 is pretty much close to 700. What that means is -- you know,
for the join, you also have to estimate the join selectivity, not just the input
cardinalities. So the join's [indiscernible] estimate was more or less accurate.
So essentially, as of now, our optimizer uses these three pieces of information. We have
much more, because when we run this, we can observe how much [indiscernible], the amount of
I/O, [indiscernible]. This is all information we have, but we have not found a way to plug
it back into this optimizer in a simple manner.
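To illustrate those three numbers with the figures from the slide: the inputs 1,050 and 850 and the values 670 and 700 come from the talk, while the selectivity and the estimation formula below are made up to be consistent with them.

def join_output_estimate(card_left, card_right, join_selectivity):
    # The usual textbook form: |R join S| is roughly sel * |R| * |S|.
    return round(join_selectivity * card_left * card_right)

estimated        = join_output_estimate(1000, 800, 0.00075)  # optimizer's stats (inputs made up)
actual           = 700                                       # measured when the subplan ran
estimated_actual = join_output_estimate(1050, 850, 0.00075)  # same formula, true inputs -> ~670

# estimated_actual being close to actual says the join-selectivity model was fine;
# the error came from the input cardinality estimates, not from the cost model.
print(estimated, actual, estimated_actual)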
Okay. So how does Xplus actually work once these inputs are given? It starts running these
plans, and as it is running them and collecting information, it progressively keeps on
doing that. It might find better and better plans. So what we're actually showing there is
the cost of the best plan found so far. The cost keeps on going down. This is not the cost
of the best plan as estimated by the optimizer at that point.
Its performance can actually become worse, because the optimizer is mixing accurate
cardinalities with estimated cardinalities, and sometimes that [indiscernible] be a good
thing. Often -- I shouldn't say often. Sometimes it's not a good thing. Right?
The user can actually stop this at any point and say, I'm happy with this plan that was
found. Or run it to the so-called completion point, and then we are able to provide a sort
of strong optimality guarantee.
Okay. So I'll tell you actually what happens, under what conditions we reach this point,
and how it can be reached relatively efficiently. But then we can say, look, the best plan
I have found so far is indeed the optimal plan for the current database configuration, and
that includes the current physical design, the indexes, the current server configuration
parameters -- just the large-scale configuration -- and the current hardware
[indiscernible] have been allocated to it.
So we are not changing that. We are not playing any games with that.
And each and every plan gets costed using accurate cardinality values. So every plan has
now been costed by the [indiscernible] optimizer using accurate cardinalities, not
estimates.
So if you are willing to [indiscernible] that -- and this is actually, at least in DB2,
they have some research verifying that, and to some extent we have done this for PostgreSQL
too -- if the accurate cardinalities are given, the cost models more or less do a
relatively good job, in a relative sense, of finding the best plan or ordering the plans in
the right manner, at least the good plans.
So this is again something which we definitely want to call out, because sometimes the cost
model can be inaccurate, right, or the parameters in the cost model may be badly plugged
in, but that has not been such a big problem so far. The main reason we have this, first of
all, as I said, is we want to be noninvasive, in the sense that we only suggest a fix that
would work with the current setting. There are [indiscernible] other pieces of work that
actually deal with that, in fact, all the parameter tuning work.
Getting the plan right is really the first -- the most important thing for performance.
Parameter tuning, all of those things are secondary to some extent.
So there has been actually a whole bunch of interesting work, and some of the main
challenges in this context are which subplans you're going to run, [indiscernible] actually
run them, and [indiscernible] of tuning on the production workload.
There has been interesting work in this domain, [indiscernible] work done by
[indiscernible] right here at MSR. So we have actually separated our work from some of the
earlier work, the query execution feedback work, for some practical reasons. One of the
main reasons we separate our work is, we are very focused on a query. So the DBA or user
has given us: this is the query I want you to tune. And that is the sort of context we see
in the PostgreSQL community.
Often 95 percent of the queries work fine. There are some of these, you know, killer bad
queries. We actually focus on those. And we have very consciously chosen to separate these
so-called tuning sessions from regular optimization for more or less the same reasons.
And an important reason is, you know, I was having this discussion with Rene, who was
telling me that even if you bring in this query execution [indiscernible] and put it back
in -- the optimizer will now have some accurate cardinalities while the rest are still
estimates -- there's no guarantee that the plan it finds will actually be better than the
previous plan. And so there's a lot of plan error, funny things that can happen. So we have
chosen to separate these things out, clearly separate out tuning sessions; use them when
you know how to use them and when you really want them.
And another thing, and this is something that we got when we presented the very first
version of this work more than a year back, at the DBTest workshop in 2009: if you are
proposing a whole bunch of changes, right, then, I guess from the commercial database
perspective, the more changes you propose to [indiscernible] there, the less likely it is
going to actually be adopted.
So definitely, since we are working on the optimizer, we want to make zero changes to the
query execution engine. And interestingly, I'm actually also going to tell you how we get
away with making zero changes to the [indiscernible] database query optimizer itself. It
might sound weird, but there's a way in which we actually do that.
So this whole problem, and some of the ideas that we use, especially running plans and
using feedback, is not really new.
What I'm going to spend a lot of time focusing on is this big challenge of which plans, out
of a large space of plans, to actually run to reach a tuning goal. And the main
[indiscernible] -- if there is a single main idea in [indiscernible], it is actually how we
treat this overall physical plan space.
So imagine this were the entire space of physical plans. So P1 through P8 are different
physical plans. We actually try to group these plans together into what I call plan
neighborhoods. And the plans in each neighborhood are related in some specific manner;
there is some structure that we want to exploit.
The key to this is the notion of the cardinality set of a physical operator, or a plan, or
a neighborhood. Okay.
So let me explain what a cardinality set is. So consider this simple, you know, operator, a
hash join that works on two tables, with a filter on one of these tables. The way in which
the query optimizer estimates the cost of an operator like this is, it has some sort of
cost formula to which you need inputs, and some of the main inputs are the expressions, and
especially the cardinality values for those expressions.
Let me give you an example. Right. So for this hash join, in PostgreSQL, to estimate the
cost of the hash join, the optimizer needs the sizes of its inputs. So in this case the
expressions are the sigma of R, and S. Right? For these expressions, it needs these, in
blue, the cardinality values. And it plugs that in, along with some other parameters like,
you know, [indiscernible] the buffer size and the [indiscernible] size and all of this, and
[indiscernible] estimates the cost. Okay.
So we define the cardinality set of a physical operator as a set of relational algebra
expressions. Expressions, not values. Okay. And this is a distinction I just want to
emphasize: the expressions whose cardinalities are needed to cost that operator.
For a plan, again, you can sort of see it, right: just lay out the plan, and for each
operator there will be a set of expressions; take the union of all of these expression sets
and you get the cardinality set of a physical plan. Again, expressions, not values.
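A small sketch of that definition, under the simplified version used in the talk; the dictionaries below are toy plan descriptions, not Xplus data structures.

def operator_cardinality_set(op):
    # The expressions whose cardinalities this operator's cost formula needs.
    return set(op["input_exprs"])

def plan_cardinality_set(plan):
    # Union over all operators in the plan: expressions, not values.
    return set().union(*(operator_cardinality_set(op) for op in plan["operators"]))

plan_a = {"operators": [
    {"name": "HashJoin",  "input_exprs": ["sigma(R)", "S"]},
    {"name": "HashJoin",  "input_exprs": ["sigma(R) join S", "T"]},
]}
plan_b = {"operators": [
    {"name": "MergeJoin", "input_exprs": ["sigma(R)", "S"]},          # operator swapped
    {"name": "HashJoin",  "input_exprs": ["T", "sigma(R) join S"]},   # inputs flipped
]}

# Same cardinality set, so the two plans land in the same neighborhood
# under the simplified definition.
assert plan_cardinality_set(plan_a) == plan_cardinality_set(plan_b)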
And once you can associate a set of expressions with a plan, you can actually say, look,
I'm going to divide up -- I'm going to cluster this set of plans based on equality of
cardinality sets. Okay. I'm going to use the simple definition. In fact, we actually have a
more complex definition of plan neighborhoods, because we want to minimize the number of
neighborhoods; we want each neighborhood to be as large as possible. But to keep things
simple, we're going to say for this [indiscernible] that our definition is: a neighborhood
contains plans, all of which have the same cardinality -- yes, [indiscernible]?
>>: [Indiscernible].
>> Shivnath Babu: All of them. Yeah. Because the objective of this is really to structure
[indiscernible] the plan space; now I have to talk about how the optimizer makes progress
toward a tuning goal. That's where all these other things come in. Okay. Any other
questions?
So the [indiscernible] here is, well, if I have the accurate cardinality values to cost one
plan in the neighborhood, I can cost all the other plans accurately. [Indiscernible] this
is the relationship. [Indiscernible] simple model.
And naturally, our notion of the cardinality set of a neighborhood, which is trivial here,
is the union of the cardinality sets of all plans in that neighborhood, which again is
trivial here because all plans have the same cardinality set.
A simple example. Two plans, just a little different, but which in reality have the same
cardinality set, at least in PostgreSQL. This plan is a three-way join with some filters
and a couple of hash joins. This one has exactly the same tables, same filters. There is a
hash join being replaced by a merge join. There's a hash join where the order has been
flipped. These sorts of things don't really affect the cardinality set.
So a single cardinality set is actually shared by a whole bunch of different plans.
>>: [Indiscernible].
>> Shivnath Babu: And commutation. But it's really a function of your, you know, the
optimizer's cost model. Yeah. So that's exactly right. So basically, within the same
[indiscernible] in PostgreSQL, within the same neighborhood, there are essentially all
possible replacements of operators, [indiscernible] to include indexes and the things they
access and the [indiscernible]. So this will define that neighborhood. In fact, we use
transformations to generate the plans within a neighborhood.
>>: [Indiscernible].
>> Shivnath Babu: So these are really at the level of physical plans, not really the
logical one. So if you're talking at the level of an expression, then -- yeah, I think
that's basically one way to categorize it. A [indiscernible] group, as I understand it, is
really a logical subexpression.
Okay. Now, so once we have these neighborhoods, we are starting to talk more about
the -- yes?
>>: [Indiscernible] the question [indiscernible] class of complexity that you're thinking
but [indiscernible] group does not. That was the [indiscernible] --
>> Shivnath Babu: The [indiscernible] group is actually sort of representing all possible
physical plans for that logical expression.
>>: But these are just different plans, but [indiscernible] complexity except through this
physical [indiscernible], you [indiscernible] expression which is affecting the cardinality
[indiscernible] with that property.
>> Shivnath Babu: Mm-hmm.
>>: [Indiscernible] and that is a stronger notion than a typical [indiscernible] group
[indiscernible].
>> Shivnath Babu: Okay. Well, said that way, essentially the grouping is: all plans which
can be costed using the same set of cardinalities.
>>: What is the difference [indiscernible]? Isn't it just captured by the [indiscernible]?
>> Shivnath Babu: No. It could be that -- just to give you an example, assume that instead
of the merge join there was an index nested-loop join, right? To cost that join, at least
in PostgreSQL, you need not know the actual size, the number of [indiscernible], on the
probing side -- yeah, on the index probing side. Right? You only need to know on average
how many records --
>>: [Indiscernible].
>> Shivnath Babu: -- [indiscernible], yeah. So the [indiscernible] physical versus logical,
there are --
>>: [Indiscernible].
>> Shivnath Babu: Yes. Yeah. Okay. So [indiscernible] why all of these neighborhoods,
right? The main reason is, the way Xplus makes its progress as it's trying to tune a query
can be cast [indiscernible] in terms of how it goes about what we call covering
neighborhoods.
So a neighborhood is effectively covered when accurate cardinality values are available for
all expressions in its cardinality set. Okay. So if a neighborhood is covered, then all
plans in it can be costed accurately [indiscernible] cost model. Okay.
You can think of Xplus's progress this way -- this is the picture I showed you earlier.
Imagine the physical plan space has already been divided into these neighborhoods. As it
runs, the process might actually end up covering this one first and then this one, and
actually when it runs a plan, multiple neighborhoods might get covered together, and at the
end, you get the optimality guarantee when all neighborhoods are covered.
So in the process of achieving this, there are some nice efficiency guarantees that Xplus
can actually make. The first one is that it runs at most one plan to get the accurate
cardinality values for a neighborhood.
This might seem trivial, right, because the way I defined a neighborhood was all the plans
which have the same cardinality set. But there is a subtle distinction which I'll get into
in a moment, because it is not always the case that you can efficiently [indiscernible]
cardinalities for all expressions that are needed to cost the plan. So, you know, in one
case we are saying these are things you need to cost the plan. In the other case we're
saying these are things we can measure when a plan runs. These are sometimes different. And
I'll get to it. Right?
But we have ways in which we actually ensure that it is at most one plan, and the
interesting thing is, it's often not a full plan. It's actually a subplan.
And another thing is, we have this feature that the moment a plan is actually run in a
neighborhood and it gets covered, so accurate cardinality values are measured for some
expressions, we have a data structure which enables you to find the minimal set of other
neighborhoods whose plans have to be recosted, because you now have an accurate cardinality
value for an expression there, whereas you only had an estimate before. Okay. This is just
a data structure [indiscernible].
Then, all the neighborhoods can be covered by -- so the naive thing would be to run one
plan per neighborhood to cover all neighborhoods. But because of the intersections and, you
know, some [indiscernible] properties and so on, you don't have to run a plan for each
neighborhood. In fact, you end up running very few plans. Just to give some numbers, for
query 9 in TPC-H there are around 36 neighborhoods, and only 7 or 8 subplans need to be run
to cover all neighborhoods.
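As a rough illustration of why so few runs suffice, here is a toy greedy sketch of the covering step; the real system interleaves this with the experts and policies described later, and the data shapes here are invented.

def greedy_cover(neighborhood_card_sets, subplan_measures):
    # neighborhood_card_sets: {neighborhood: expressions whose cardinalities it needs}
    # subplan_measures: {subplan: expressions whose cardinalities its run would measure}
    known, chosen = set(), []
    uncovered = dict(neighborhood_card_sets)
    while uncovered:
        # Pick the subplan that would measure the most still-unknown expressions.
        best = max(subplan_measures, key=lambda sp: len(subplan_measures[sp] - known))
        if not subplan_measures[best] - known:
            break                                  # nothing new can be measured
        known |= subplan_measures[best]
        chosen.append(best)
        uncovered = {n: cs for n, cs in uncovered.items() if not cs <= known}
    return chosen

# Toy usage: three neighborhoods covered by two subplan runs.
print(greedy_cover(
    {"N1": {"R", "S"}, "N2": {"R", "S", "R join S"}, "N3": {"S", "T"}},
    {"sp1": {"R", "S", "R join S"}, "sp2": {"S", "T"}}))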
So basically, you know, you can exploit all of these properties of the setting to really
cut down on the number of plans you actually run. Yes?
>>: [Indiscernible]. If I execute these plans they get the cardinalities from [indiscernible].
>> Shivnath Babu: Definitely. So the way I actually see it, if my tuning goal were to
measure all cardinalities, to get to this point as quickly as possible, right -- that's not
exactly the thing you would want to do. [Indiscernible] knows ahead of time. I want to
figure out how much performance improvement I can get, right, so that I can avoid all the
invasive things. That would be a --
>>: [Indiscernible].
>> Shivnath Babu: Yeah. But that's how it will be. It could be applied, yeah.
>>: [Indiscernible].
>> Shivnath Babu: And sometimes after running experiments we have [indiscernible]
trying to [indiscernible] results and if you want to get to that point.
Okay. So essentially, once the space is covered, we actually get this property. Okay.
So now, let me just take a step back and ask how [indiscernible] works at some level. So
what are the challenges, right? What are the sort of [indiscernible] problems that had to
be solved? How do you enumerate all these neighborhoods? Right? How do you enumerate plans
within a neighborhood? Which plan in a neighborhood to actually run as our subplan?
In terms of progress, I'm at some point -- it's always the case that at some point there is
some information, and we're figuring out what to do next. So which neighborhood to go after
next, right? And once you pick a neighborhood, which plan or subplan to run within that
neighborhood to get the required information for that neighborhood.
So I'm going to go through each of these questions in some level of detail. It took
[indiscernible] one and a half years, maybe even closer to two years, and one complete
rewrite to actually get to this point, because a lot of the high-level ideas might seem
easy and probably even straightforward, but to actually get it to work and to really
implement [indiscernible] optimizer [indiscernible] can actually [indiscernible] a lot of
times. So I'm kind of trying to stay away from the details, but I'd be really happy to
talk about those [indiscernible] offline. Okay?
So the first problem: how do you enumerate these neighborhoods, and, while enumerating,
[indiscernible] neighborhood?
First [indiscernible] try to [indiscernible], because we have PostgreSQL, which is a
bottom-up style optimizer, right -- enumerate all the plans and try to impose some
clustering on them, but that just [indiscernible].
So a transformation-based approach is the best thing in such a setting, where you can
carefully select which plans to generate, only the plans you can afford to [indiscernible].
The way to go in here, and what we have done, is we have actually built a sort of
transformation-based [indiscernible] optimizer, if you want to call it that. But the
[indiscernible] thing is writing these transformations -- these are physical plan
transformations -- and being able to characterize them into one of two categories.
So this transformation applied to a physical plan will generate a new plan in a different
neighborhood. And this transformation applied to a plan will generate a plan in the same
neighborhood. So the commutativity transformations, and the transformations replacing
physical operators, with some corrections for [indiscernible] physical properties -- those
will actually fall in here. Right.
And changing join [indiscernible] and those sorts of things will actually fall in there,
[indiscernible] pushing up [indiscernible] or pushing down expensive [indiscernible], which
we have not really done, but they would go in there. [Indiscernible] doesn't have that. So
we would like it.
So the other key question is, how do I decide at any point in time, based on the
neighborhoods that have been covered, which neighborhood to cover next? Okay. Let me
illustrate this sort of decision with an example. So suppose I have covered that green
neighborhood, N1, and I have these three neighborhoods and I'm trying to decide which one
to go after next. And let's say that we have found, based on the current set of available
cardinalities, some [indiscernible] and actual, that P2, P3, and P4 are the respective
least-cost physical plans, as estimated by the optimizer, for each of these neighborhoods.
And again, these are uncertain, because the optimizer is mixing accurate cardinalities with
estimated cardinalities.
And suppose, just again, that in N2 -- when you look at the level of the cardinality
sets -- in these two neighborhoods there are two cardinality values that are missing.
Right? And in here, there are three cardinality values that are missing.
So this actually kind of poses [indiscernible]: should we go after N2 next, because that is
where our current cheapest plan, so to speak, exists? Right? But that cheapest plan is only
an estimate; it could be totally wrong. Right? Or N4, because if you were to go after N4,
that's going to bring in more, sort of, you know, accurate cardinality values, so it's
going to convert more uncertain values to certain. Right?
This is the exploration-exploitation problem manifested in SQL tuning. So the
exploration-exploitation problem is really: I have some variables for whose values I have
some estimates as of now. Okay? And I am trying to make a decision based on that. So
exploitation would be sort of saying, I know the uncertainty is there, but I assume these
estimates are all accurate and then go with my decision. In this case, that would be going
after the current cheapest plan.
Exploration would be acknowledging the uncertainty, and I [indiscernible] resolve that
uncertainty. And there are multiple -- and this actually took us a long time, to sort of
converge on some sort of an architecture here. And our current sort of best solution is a
solution that uses this experts-and-selection-policy approach, actually a well-known
technique in this [indiscernible] area. Yeah?
>>: [Indiscernible].
>> Shivnath Babu: Why don't you --
>>: [Indiscernible].
>> Shivnath Babu: Yeah.
>>: [Indiscernible].
>> Shivnath Babu: So that can be done. Right? I think this is where, in some sense, when we
presented this first level of work -- the sort of thing is that, regardless, there are some
standard techniques that people use when they're going after this sort of tuning. They look
at join selectivity [indiscernible] values, or the values that sort of come out, and based
on that they figure out, oh, this actually seems like a better [indiscernible] to try next.
So there are, in some sense, right --
>>: [Indiscernible].
>> Shivnath Babu: Yeah.
>>: [Indiscernible].
>> Shivnath Babu: I'm not actually going to come in and say, look, this first one we have
is the best. Right? But from a practical standpoint, we probably have to try multiple
things, and actually I think our first attempt, at least my thought, was on [indiscernible]
trying to actually capture the benefit, because in the whole previous line of work on
iTuned and everything, that's exactly what we have done: trying to quantify benefit based
on the uncertainty and, you know, coming up with a closed form and going after that.
>>: [Indiscernible].
>> Shivnath Babu: Actually, in essence, when you see the solution, something like this is
what we have. So this notion of experts -- let me show you this thing. So two experts you
can think of are exactly what I showed you in the previous slide: the pure exploiter is
always going to go to just that neighborhood where the least-cost plan lies. Go after that.
Right? And the pure explorer will actually go after just that neighborhood where the
maximum number of uncertain cardinality values are.
You can actually make them fancier. You can try to get some estimates of the uncertainty
and [indiscernible] those and whatnot, which actually gets pretty hairy. You can take them
and implement other versions of these experts. And the system, partly for that reason, is
very extensible. Okay?
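A schematic rendering of those two baseline experts; the class names and dictionary fields are illustrative, and the cost numbers in the usage example are invented (only the unknown-cardinality counts 2, 2, 3 come from the slide).

class PureExploiter:
    # Trust the current estimates: go to the neighborhood whose (estimated)
    # least-cost plan is cheapest.
    def recommend(self, neighborhoods):
        return min(neighborhoods, key=lambda n: n["least_cost_plan_estimate"])

class PureExplorer:
    # Attack uncertainty: go to the neighborhood with the most unknown
    # cardinality values.
    def recommend(self, neighborhoods):
        return max(neighborhoods, key=lambda n: n["num_unknown_cardinalities"])

nbhds = [{"name": "N2", "least_cost_plan_estimate": 40, "num_unknown_cardinalities": 2},
         {"name": "N3", "least_cost_plan_estimate": 55, "num_unknown_cardinalities": 2},
         {"name": "N4", "least_cost_plan_estimate": 70, "num_unknown_cardinalities": 3}]
print(PureExploiter().recommend(nbhds)["name"],   # -> N2
      PureExplorer().recommend(nbhds)["name"])    # -> N4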
And there's going to be a selection policy that is sort of going to arbitrate among these
experts. So the challenge, the sort of [indiscernible], would be to figure out, for
instance for SQL Server, what would be a good [indiscernible]. And even, you know,
sometimes it could be that, as an optimizer [indiscernible], you might actually see cases
that are coming to you for a solution, see a specific type of scenario where it
[indiscernible] making a mistake. So there's nothing preventing you from writing a new
expert that can see this scenario and suggest a good strategy for that setting.
We have [indiscernible] on this architecture primarily because, in that [indiscernible]
workshop, we were talking to real optimizer people from Sybase, and they sort of let us
[indiscernible], but I'm not going to say that this is the approach you should take.
What I want to sell more is this overall [indiscernible] of [indiscernible] being applied
to solving the problem. And of course you can actually have other experts which have some
notion of some mix of exploitation and exploration. Right? And again, I don't have the time
to get into details, but these two sorts of experts look at the join [indiscernible], for
instance. Remember the estimated-actual cardinality I showed you some time back, where,
looking at a particular operator, and after having gotten the accurate cardinality values
for its children, you can ask the question: if the optimizer did know the cardinality
values of its children, then what would be its estimate of the cardinality of that
operator? Right?
So this sort of takes error into account: you can compare the estimated-actual cardinality
with the actual cardinality and get an idea of which joins -- whether the [indiscernible]
optimizer [indiscernible] thought a join was less selective than it is, or the other way
around. And these guys take [indiscernible] exploration into account to suggest a
potentially better join order.
And then there's a selection policy. So once these experts actually bring in
[indiscernible] neighborhoods, it decides which expert to go off to at any point in time.
There are again standard things you can actually apply. The fancier one would be a
reward-based selection policy, which sort of keeps a history of how good the experts were
in the past and tries to pick an expert which was actually working better in the past.
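A minimal sketch of a reward-based selection policy of that flavor: keep a running reward per expert (say, how much the best plan improved after following its pick) and favor the expert that has done best so far, with a little randomness so no expert is starved. This is a generic scheme, not the specific policy in Xplus.

import random

class RewardBasedPolicy:
    def __init__(self, experts, epsilon=0.1):
        self.experts = list(experts)
        self.epsilon = epsilon
        self.history = {e: [1.0] for e in self.experts}   # optimistic start

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(self.experts)             # occasional exploration
        return max(self.experts,
                   key=lambda e: sum(self.history[e]) / len(self.history[e]))

    def record(self, expert, reward):
        # Reward could be, e.g., the drop in best-plan cost after this expert's pick.
        self.history[expert].append(reward)

policy = RewardBasedPolicy(["exploiter", "explorer"])
chosen = policy.choose()
policy.record(chosen, reward=0.3)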
Okay. So that brings us to this other problem: once I have decided I'm going to go after
this neighborhood, which plan do I pick in that neighborhood?
So in a sense, the way we have cast this problem is saying, look, I want to actually find a
subplan to run in that neighborhood which will bring in the unknown cardinality values as
[indiscernible] as possible.
So imagine this to be a plan in your neighborhood, and as of now, based on the cardinality
expressions for which you have accurate values, let's say that the only ones that are
missing are S and the join of R and S.
In this case, of course, you don't have to run this part of the whole plan. Right? So
there's essentially a subplan identification which will kind of bring in the values you'll
need.
And this other issue that I mentioned is, imagine the [indiscernible] in this neighborhood
is actually an index nested-loop join. This [indiscernible] going to be the cheapest plan.
An index nested-loop join is not going to give you the cardinality of S, right, because
it's not going to scan the [indiscernible]; it's just going to probe it. So there are some
tricks you have to play depending on the complexity of the execution engine, and we have
decided not to make any changes to the actual execution engine and do everything from the
outside.
So what we'll do is we'll go through each plan in this neighborhood and find what is the
smallest plan, with required modifications -- if I may [indiscernible] the plan, that is
also a modification -- that will bring in the cardinality values that we need. That's
roughly the problem [indiscernible] here.
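A toy rendering of that subplan-identification step: keep only the part of the plan needed to produce the missing cardinality values, remembering the caveat that some operators (like the index nested-loop join above) don't expose the cardinality of the side they merely probe. The plan representation here is invented.

def smallest_subplan(plan, missing_exprs):
    # plan: {node_id: {"expr": expression it produces, "children": [node_ids]}}
    needed = set()

    def mark(node_id):
        # To measure an expression we must execute the node that produces it
        # and everything underneath it.
        needed.add(node_id)
        for child in plan[node_id]["children"]:
            mark(child)

    for node_id, info in plan.items():
        if info["expr"] in missing_exprs:
            mark(node_id)
    return needed

# Toy usage: only |S| and |R join S| are missing, so the subtree rooted at the
# R-S join is run, and the join with T above it is not executed at all.
plan = {"scanR":   {"expr": "R", "children": []},
        "scanS":   {"expr": "S", "children": []},
        "scanT":   {"expr": "T", "children": []},
        "joinRS":  {"expr": "R join S", "children": ["scanR", "scanS"]},
        "joinRST": {"expr": "(R join S) join T", "children": ["joinRS", "scanT"]}}
print(smallest_subplan(plan, {"S", "R join S"}))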
Okay. So that brings me to the last part of Xplus, which is the architecture and the
implementation. Since I'm running out of time and I see I have ten more minutes to
2:30 -- is that -- does somebody have --
>>: [Indiscernible].
>> Shivnath Babu: Okay. So the architecture looks pretty much similar to what I introduced:
something for the experts and the selection policy, then [indiscernible] keeps
[indiscernible] all of these values, a component that, once the neighborhood
[indiscernible], picks the plan with the modifications and whatnot, and does the plan
costing and enumeration. And this part is really the -- you know, once a plan has been
picked, this part is actually going to run that subplan and collect information. Right?
So this is really tied to this other infrastructure that we [indiscernible], where you can
[indiscernible] run experiments on a cloud or on a test system or on the [indiscernible]
system; this is a separate entity. And essentially you can think of this part as finding
the plans and this guy as actually running them, scheduling them [indiscernible] possible,
and giving the information back.
And the controller is the thing that is driving everything. And we have implemented a
number of different controllers.
So this is the architecture. We have implemented [indiscernible] for PostgreSQL, and the
hope is to release the code. PostgreSQL -- unfortunately we did it for the 8.3.4 version.
They released 9 recently. And they've actually changed the -- so there was this interface
that we had implemented through -- well, actually I think it will become clear when I show
you something else.
So we wanted to make it usable for PostgreSQL. That is the main [indiscernible] we've been
focusing on, because it's open source. But I'd definitely like to get feedback on whether
something like this would be interesting or useful in the case of SQL Server.
One important piece of feedback we got -- Postgres posed a challenge to us. So the folks at
DB [indiscernible] actually told us, look, if you're going to propose this architecture for
an optimizer, nobody is going to take it. This looks horribly complex. Right? But they
encouraged us to sell it as a tuning tool which lives beside the regular optimizer -- maybe
you start your life as a tuning tool and then slowly get incorporated into the optimizer.
That, they were willing to take.
So that actually kind of shaped the architecture. And what we have now is essentially this
optimizer [indiscernible] has its own job specification. It's a tuning tool that we can
use, and there's a specific, very well-defined interface, which essentially looks like:
this tool needs the ability, from the database, to actually give it plans and get the
costing -- get the plans costed, maybe with some accurate cardinality values I can give
you -- and also get its preferred plan. And implementing this in Postgres required some
changes to the [indiscernible], which they have sort of taken out. So we have to now
implement a whole new [indiscernible] implement this interface for 9, and we have not done
that yet. Yes, Lee?
>>: Was this [indiscernible].
>> Shivnath Babu: We have [indiscernible] that. Essentially, if you think about it, in the
[indiscernible] optimizer, that definitely is its functionality: taking cardinality values
and estimating costs for plans, right? All we have done is take this internal API and sort
of push it out so you can call it from a client. So that's why I said zero changes to the
PostgreSQL optimizer, right? Although we implement a new optimizer.
And two other things we have focused on a lot are extensibility -- the system is extensible
at the level of the controller. We have implemented multiple controllers: the experts
controller [indiscernible] described, running both in a single mode, where one expert is
picked at any point in time, and in a parallel mode as well. We have implemented two pieces
of related work: the IBM LEO work and the Oracle automatic tuning optimizer work. The
reason we didn't implement the [indiscernible] is, we were not willing at this point to
make changes to (inaudible). But if that's something that someone introduces, I think we
can actually combine these two and [indiscernible] much better things, because you get more
information when you run a subplan or a plan.
The other thing is efficiency, because in this whole [indiscernible] work, the reason our
papers used to get shot down is because we would say, look, it takes a long time. Tuning
parameters -- in some sense, I think mentally it seems to me that a lot of people have
[indiscernible]: I want tuning, I want this recommendation to be very fast. That's not how
things happen, right? Because [indiscernible] spend weeks tuning these systems.
But at the same time, we took that as a challenge to always, in our tools, implement these
so-called efficiency features: parallelism, subplans, and a host of other things, right? We
have a big laundry list where we have spent a large engineering effort to get good
efficiency.
So how do we go about evaluating all of this? Well, evaluating tuning, it turns out, is not
an easy task. So we have developed a benchmark. What we did is we [indiscernible] a whole
bunch of people, including people at DBTest, and our colleagues have sort of developed
this, you know, set of tuning scenarios. So these tuning scenarios are defined based on
data and based on some potential root causes that can arise in practice, like -- there
could be a query-level issue: there's a user-defined predicate in there whose selectivity
is very hard to estimate by default. Or data-level issues: skew in the data, correlations
in the data.
So these are some issues. Maybe statistics are missing. And [indiscernible] issues that
might be [indiscernible] the optimizer picking a bad index or not. And for each of these
things, there's also an objective. Right? One objective might be finding the optimal
plan -- [indiscernible] said they actually want us to find the best plan, because they're
trying to figure out, for the current setting, what is the best performance they can
achieve [indiscernible] tuning [indiscernible]. Or let's just say improve performance by
5x. And there are different aspects of evaluation; let's see some results.
So here is what is happening [indiscernible], for a query from TPC-H, for one of our tuning
scenarios. PostgreSQL -- the plan the PostgreSQL optimizer finds [indiscernible] -- runs in
257 seconds.
Running Xplus, Xplus manages to find a plan that runs in 21 seconds. Okay. That's a factor
of 12 improvement, which is not really surprising [indiscernible]; it's running
[indiscernible] and actually seeing things. And the main thing that's actually taking the
most time for us -- and the other numbers show it -- is that it actually runs, for this
query, only six subplans, no full plans, only six subplans to tune it, and the overall
tuning time is 131 seconds.
So this number is going to be a big number. But the best way to look at this number is to
look at the ratio of this number to that one: how many times would that bad plan sort of
have to run, compared with how much time it took you to actually find a better plan.
So that ratio is about a half.
Similarly, now focusing at the workload level, imagine these eight queries were the really
bad queries in your workload, and they're taking around 97 minutes overall to run. Xplus,
after running, is able to cut down the workload running time to 32 minutes, in 24 minutes
of tuning. Okay. Just around one-fourth of the time it takes you to run that workload.
So the other sort of tuning aspect is where the administrator wants a specific performance
improvement, like in this case we're actually looking at a 5x improvement. All right. So
Xplus, in the same scenario that we saw earlier, in half the time it takes to run that
original plan, is able to produce a 12x speedup.
So if I'm looking for a 5x speedup, this definitely is able to produce a satisfactory plan.
Okay.
IBM's LEO and ATO are actually not really focused on a single query, and the problem with
LEO -- and I think this is a general problem that can arise [indiscernible] queries
[indiscernible] -- is that what you can measure is basically determined by what plan you're
running.
So imagine you run a plan and you get some accurate cardinality values and throw them back
into the optimizer. It still picks another plan in that same neighborhood. In this case it
will run it again. Your monitoring opportunities, at least in the case of [indiscernible],
you know, [indiscernible] out as zero. So you get stuck. So you can actually get stuck, and
it found a plan with a 3.2x speedup. Right? And ATO actually found one with a 4.2x speedup.
ATO is a more expensive [indiscernible]. It runs plans for each single table and counts
each [indiscernible]. It's not really scalable at some level. Okay?
Now, the other important reason why we have shown this in red is that LEO [indiscernible]
found a plan with a 3.2x speedup, which doesn't meet my requirement here, but I don't know
whether there's an even better plan. Right? So in those cases [indiscernible], it can't
fail gracefully, because there might not be a plan with a 5x speedup; but in our case, you
know, subject to the assumption that the optimizer cost model is accurate when given
accurate cardinalities, at least you know that probably now you need to invest in invasive
tuning if you really want that speedup.
And then similarly at the workload level -- I'm just going to skip it in the interest of
time.
>>: [Indiscernible] are you exploiting the [indiscernible] AX [indiscernible]?
>> Shivnath Babu: No. Not now. But that will be something, I guess -- the answer is no, but
here is one aspect which might be useful to address a question that Lee brought up, and
this issue, which is: so now I've introduced a whole bunch of different policies inside
Xplus, right? You can have these four experts, maybe many more experts or fewer experts --
which policy is actually best?
Based on our experiments, in a tuning setting you can think of two performance metrics. One is what we call convergence: I'm looking for a performance improvement, and how much time does it really take to give me that improvement? The other is time to completion: give me the guarantee, and think of that as completion.
The short answer -- because I do want to keep five minutes to talk about some of the recent focus -- is that you do exploitation when you're interested in some quick improvements. But if you are interested in getting to that final point, exploitation is a bad idea. Exploitation literally ends up making small, small, small progress: [indiscernible] gives some values, maybe [indiscernible] that will actually suggest a [indiscernible] different plan -- I knew that -- [indiscernible] values. It takes a long time to, you know, cover that space.
So in those settings exploitation has its place, but if I know that I'm looking for the best improvement, I can tell you what's a good policy for that.
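A minimal epsilon-greedy sketch of this exploitation-versus-exploration trade-off among plan-recommending experts -- an illustration only, not the actual policy inside Xplus; the expert names, reward bookkeeping, and the run_suggestion callback are all hypothetical.

```python
# Generic epsilon-greedy sketch: mostly exploit the expert whose suggestions
# gave the best observed speedups so far, but sometimes explore another one.
import random

def pick_expert(experts, reward_history, epsilon=0.2):
    """With probability epsilon explore a random expert; otherwise exploit
    the expert whose past suggestions gave the best average speedup."""
    if random.random() < epsilon or not any(reward_history.values()):
        return random.choice(experts)                       # explore
    return max(experts,
               key=lambda e: (sum(reward_history[e]) / len(reward_history[e]))
                             if reward_history[e] else 0.0)  # exploit

def tune(experts, run_suggestion, budget_seconds, epsilon=0.2):
    """Repeatedly ask an expert for a (sub)plan to run, run it via the
    caller-supplied run_suggestion callback, and record the observed speedup
    and cost, until the time budget is exhausted."""
    history = {e: [] for e in experts}
    spent, best = 0.0, 0.0
    while spent < budget_seconds:
        expert = pick_expert(experts, history, epsilon)
        speedup, cost = run_suggestion(expert)   # caller runs the experiment
        history[expert].append(speedup)
        spent += cost
        best = max(best, speedup)
    return best, history
```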
There are some results which actually show that. Any questions so far? Given that Vic has given me five more minutes, I want to press on, please, to just illustrate some of the things we are thinking about. The [indiscernible] is all good; we are building a whole bunch of tools -- iTune and this other tool -- and other things.
Okay. So there are three areas that we are actually focusing on a lot. First of all, we are trying to apply this to more and more -- I won't say more and more, but to some different systems and applications, which I guess [indiscernible] DB we really haven't focused on that much. I've been talking to system administrators mostly, not purely database administrators, although they do maintain databases -- at [indiscernible] at Duke and a little bit with some data-import teams at IBM. And when I actually talked to them, performance is not really the first thing on their mind, for whatever reason. Right?
One thing that's really solidly on their mind is corruption: their data getting corrupted. There are many reasons for that, especially with this drive towards running things more on commodity hardware. Commodity hardware is flaky. It can run Hadoop and these other things and give you some numbers, but can you really trust your data on it? There can be bugs, the hardware is pretty flaky, there are [indiscernible] bugs that can happen, and system administrators might make a mistake that causes data to get corrupted. So we are trying to apply this methodology to that context.
By automatically running these data [indiscernible] checkers -- the [indiscernible] checker, the database checker, and so on -- getting information back from that, and figuring out what other checks to run.
Can I use snapshots to run these things? That has actually become very, very easy and very efficient, because storage systems [indiscernible] have put a whole bunch of focus on collecting snapshots. We're not going to spend too much time on that unless there are questions.
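A minimal sketch of checking a snapshot rather than the live data, assuming a ZFS dataset (the copy-on-write technology mentioned later in the talk); the dataset name and the checker commands are hypothetical placeholders, not part of the actual tool.

```python
# Sketch: clone a copy-on-write snapshot and run integrity checkers against
# the clone, so the live data is never touched. DATASET and CHECKERS are
# hypothetical placeholders (e.g. fsck-style or database checkers).
import subprocess
import time

DATASET = "tank/dbdata"                                     # placeholder dataset
CHECKERS = [["/usr/local/bin/db_checker", "/mnt/check"]]    # placeholder command

def run_checks_on_snapshot(dataset, checkers, mountpoint="/mnt/check"):
    snap = f"{dataset}@check-{int(time.time())}"
    clone = f"{dataset}-check"
    subprocess.run(["zfs", "snapshot", snap], check=True)
    subprocess.run(["zfs", "clone", "-o", f"mountpoint={mountpoint}",
                    snap, clone], check=True)
    try:
        results = {}
        for cmd in checkers:
            proc = subprocess.run(cmd, capture_output=True, text=True)
            results[" ".join(cmd)] = (proc.returncode, proc.stdout)
        return results
    finally:
        # Destroy the clone first (it depends on the snapshot), then the snapshot.
        subprocess.run(["zfs", "destroy", clone], check=False)
        subprocess.run(["zfs", "destroy", snap], check=False)
```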
The two other things I wanted to focus on: where do you run these experiments -- the so-called harness for running experiments? So suppose this were a production database system, with some clients running, and, you know, we have to tune their system, and maybe it's multi-core, multi-disk and whatnot. But often administrators will not actually let you do anything. Run experiments there? Never. Right? So that got us thinking about where a good place to run these experiments would be.
So one option is that you can run them on a standby platform. [Indiscernible] keep standby platforms because if the production system -- the primary -- were to die, this guy has to take over. The hardware configuration, at least in some of the scenarios we've seen, not all, is very similar to the primary's, because if the primary were actually gone [indiscernible]. I've seen scenarios where there's a 3x investment in exactly the production hardware, in [indiscernible] and different [indiscernible], that always looks pretty much the same as the primary. And the most important thing: the data is actually kept pretty much up to date. That matters because for experiments we don't want to run with [indiscernible] copies of the data, especially for parameter and query-level tuning.
But this is a resource which actually turns out to be heavily underutilized.
Or on the cloud. Imagine your database might some day actually be running on the cloud. So running these experiments on the cloud -- just taking a snapshot of your [indiscernible], cloning it, and spinning up ten different EC2 nodes to run these experiments -- is actually something which is very easy to do; just a few lines of EC2 API scripting can actually do that.
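As a sketch of that "few lines of scripting", here is roughly what launching ten experiment nodes looks like with boto3, the current Python SDK for AWS, used as a stand-in for the EC2 API scripting mentioned; the AMI ID and instance type are placeholders.

```python
# Sketch: launch ten instances from an image that already contains a snapshot
# of the database, run the experiments, then terminate them.
import boto3

ec2 = boto3.resource("ec2")

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI baked from a DB snapshot
    InstanceType="m5.large",           # placeholder instance type
    MinCount=10,
    MaxCount=10,
)

for inst in instances:
    inst.wait_until_running()

# ... dispatch one experiment (plan, parameter setting, etc.) to each node ...

for inst in instances:
    inst.terminate()
```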
So what we have done is we have built, in some sense -- I like to think of it as both the interface and the runtime system -- where the DBA or the user can specify, in a language for [indiscernible] policies, the resources on which experiments can be run and, for each of these resources, the conditions under which that resource can be used for experiments.
One of the first things we did is actually implement a policy like this in the context of the standby system being used for the database: look, I can let you run experiments on my standby under the condition that the [indiscernible] utilization on the standby stays below some value. And the one important thing you have to guarantee us is that if the primary were to fail, the experiments we are running shouldn't increase the recovery time by more than some value.
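A hypothetical sketch of that kind of policy condition, with placeholder thresholds and caller-supplied measurements; the helper names are not from any real system.

```python
# Hypothetical sketch of the standby policy just described: experiments may
# use the standby only while its utilization stays below a threshold and the
# estimated increase in recovery time stays within a bound.
UTILIZATION_CAP = 0.30          # placeholder: max standby utilization
RECOVERY_DELAY_CAP_SECS = 60.0  # placeholder: max extra recovery time allowed

def experiments_allowed(standby_utilization, estimated_extra_recovery_secs):
    """Return True if an experiment may (continue to) run on the standby."""
    return (standby_utilization <= UTILIZATION_CAP and
            estimated_extra_recovery_secs <= RECOVERY_DELAY_CAP_SECS)

# Example: 20% utilization with ~30s of extra redo to replay is allowed;
# 50% utilization is not.
print(experiments_allowed(0.20, 30.0))   # True
print(experiments_allowed(0.50, 30.0))   # False
```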
What I have in this slide is actually a quick animation of that aspect. I'm already looking at the clock and it's 2:35, and I just have this slide and another slide. Should I quickly [indiscernible] go to the summary, or can I take two more minutes?
>>: You can finish this.
>> Shivnath Babu: Okay. So essentially this is the architecture that we have implemented, using some technology on [indiscernible], although [indiscernible] is what we need. So imagine you have a production system. You are actually doing log shipping, and the standby is standing there. So what the standby will be doing is getting the logs, applying them, and being ready to take over when the primary actually dies. Right?
So what we can do, when our system kicks in and identifies an experiment -- maybe it's a plan, maybe it's a different [indiscernible] configuration -- is to do that experiment, in which case we use the zones feature [indiscernible] to do this. You sort of take what we call the home -- that is the real use of that system -- and cut it down; essentially, as little as 5 to 10 percent of the resources on that machine is enough to take the logs and do the good stuff. So that keeps going on, so we don't stop the standby from doing its actual functionality, and we carve the remaining resources into what we call the garage: the garage container.
This is very [indiscernible]. And most importantly, we use, in the ZFS file system that comes with [indiscernible], the ability to do copy-on-write, so we don't create an entire copy of the data.
So this guy is running experiments, this guy is keeping the data up to date, and only the blocks that change get copied, so literally they're sharing all the data. Of course, one thing we have to be very careful about is that this might put a large I/O overhead on the system that can affect what you see over here. Okay. So there are some [indiscernible] issues that have to be sort of taken into account, and that happens when we get the results: there's [indiscernible] monitoring, and [indiscernible] really has low-cost monitoring, and in fact it strips this noise out.
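A rough sketch of carving out the garage, assuming Solaris Zones and ZFS (which is what the description appears to use); the dataset name, zone name, CPU cap, and zonecfg script are assumptions and should be checked against the platform documentation.

```python
# Sketch: clone the database dataset copy-on-write with ZFS, then cap the CPU
# available to a "garage" zone so the "home" zone can keep applying logs.
# Dataset/zone names are hypothetical; the zonecfg script is approximate
# Solaris syntax (see zonecfg(1M)).
import subprocess

DATASET = "tank/dbdata"          # hypothetical ZFS dataset holding the DB files

def make_garage_clone(dataset):
    """Create a copy-on-write clone of the standby's data for experiments."""
    snap = f"{dataset}@garage"
    clone = f"{dataset}-garage"
    subprocess.run(["zfs", "snapshot", snap], check=True)
    subprocess.run(["zfs", "clone", snap, clone], check=True)   # shares blocks
    return clone

def configure_garage_zone(ncpus=3.0, zone="garage"):
    """Cap the CPU the garage zone may use, leaving the rest for 'home'."""
    zonecfg_script = (
        "create -b\n"
        f"set zonepath=/zones/{zone}\n"
        f"add capped-cpu\nset ncpus={ncpus}\nend\n"
        "verify\ncommit\n"
    )
    subprocess.run(["zonecfg", "-z", zone], input=zonecfg_script,
                   text=True, check=True)

if __name__ == "__main__":
    clone = make_garage_clone(DATASET)
    configure_garage_zone()
    print(f"garage clone ready at dataset {clone}")
```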
And once we have this harness on which we can run experiments, the next thought is: why not actually have a declarative language where the admin or the user can specify the experiments -- the type of experiments they want to run -- or, more importantly, their objectives? And we're putting this together and implementing the planning part.
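Since the actual language is not shown here, the following is a purely hypothetical illustration of what such a declarative experiment specification with objectives might look like, written as a plain Python dictionary with invented field names.

```python
# Purely hypothetical illustration of a declarative experiment specification;
# every field name below is invented for illustration, not the talk's language.
experiment_spec = {
    "target": "postgresql",                       # system under study
    "factors": {                                  # parameters to vary
        "shared_buffers": ["256MB", "1GB", "4GB"],
        "work_mem":       ["4MB", "64MB", "256MB"],
    },
    "objective": "map_response_surface",          # or "find_good_region"
    "metric": "workload_latency_seconds",
    "budget": {"max_experiments": 20, "max_hours": 4},
    "resources": {
        "where": ["standby", "ec2"],              # designated experiment hosts
        "policy": "only_when_standby_idle",       # condition for using them
    },
}
# The planning, conducting, and iterating over which runs to do would then be
# generated automatically from a spec like this.
```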
Remember I showed you the planning, the conducting, and the iteration -- that part will be generated by our system automatically. We call this dart X, and we're looking at two [indiscernible] for it. The first one is the one I kind of showed you, which is this benchmarking and tuning aspect, really focusing on mapping out the response [indiscernible] entirely, or finding a good region. So there's that domain, and the data corruption domain. Very likely, probably in the next couple of months, we'll actually have something to announce over here, where this whole corruption piece -- the types of tests, and running them automatically on the cloud -- comes together so that we actually have a tool [indiscernible].
So that's it. As a quick summary: the need for experiments and for automating [indiscernible] has always been there. I think now the infrastructure, especially the cloud and the ability to carve out [indiscernible], has made some of this possible. We have built a number of tools around this paradigm, and there are interesting [indiscernible] questions to think about. Should, for instance, the system support [indiscernible] as a first-class citizen? Well, then [indiscernible] coming and doing this. So thank you.
[Applause.]
Any other questions?
>>: [Indiscernible] cardinality as the [indiscernible]. [Indiscernible] problem -- different [indiscernible] cardinality create similar [indiscernible] -- if that's the chronic problem, right? If you were to combine the two, is there anything [indiscernible] that you can do to basically [indiscernible] the cardinality [indiscernible] problem manifesting through the expressions?
>> Shivnath Babu: So my answer is -- for us, the world is sort of black and white between estimated cardinalities and accurate cardinalities. We don't go [indiscernible] deeper, down to the level of why the optimizer made an incorrect cardinality estimate. Was it because of correlation? Was it because of the [indiscernible] assumption? Can we do some sampling to actually get a value? So essentially, [indiscernible] actually, yes, they have done a whole bunch of work on using sampling, and they're coming up with [indiscernible] and things like that. I've actually done some work myself. So how am I feeling about it? Great -- great research. But building a tool around that has, for whatever reason, been very hard for us. So now, why not just make things very simple: black and white, estimated versus accurate. I don't even try to say maybe this one is pretty accurate -- maybe it was actually based on a sample or something, right -- versus very accurate.
So that is something we have not gone into. Maybe it would be interesting to add, but in some sense, that aspect of figuring out why the cardinality values were wrong [indiscernible], and incorporating that somehow into the [indiscernible], the physical plan space -- once we have run a plan, what information we can take from it and give to other plans in terms of estimating their cost better -- that's something which frankly I haven't thought about.
>>: [Indiscernible] assumption specifically, [indiscernible] as it affects cardinality, you're right. [Indiscernible] used in your [indiscernible], you deduce [indiscernible] discern that properly.
>> Shivnath Babu: Do you mean it in the sense that we are summing up the cost across different operators, rather than the operators running together, and things like that?
>>: Right. That's what -- [indiscernible] that would have a different [indiscernible]
cardinality. And how would you combine them? That's the question. Do you have any
[indiscernible] on that?
>> Shivnath Babu: Not really, but maybe next time we'll have something better in mind.
>>: Well [indiscernible] --
>>: [Indiscernible]. Any other [indiscernible]? Thanks.
[Applause.]