Experiment-driven System Management
Shivnath Babu, Duke University
Joint work with Songyun Duan, Herodotos Herodotou, and Vamsidhar Thummala

Managing DBs in Small to Medium Business Enterprises (SMBs)
- Peter is a system admin in an SMB; he manages the database (DB)
- The SMB cannot afford a DBA
- Suppose Peter has to tune a poorly-performing DB
- A design advisor may not help; maybe the problem is with the DB configuration parameters

Tuning DB Configuration Parameters
- Parameters that control memory distribution, I/O optimization, parallelism, and the optimizer's cost model
- Number of parameters: around 100, with 15-25 critical params depending on OLAP vs. OLTP
- Few holistic parameter-tuning tools are available
- Peter may have to resort to 1000+ page tuning manuals or rules of thumb from experts, which can be a frustrating experience

Response Surfaces
[Figure: TPC-H, 4 GB DB size, 1 GB memory, Query 18; a 2-dim projection of an 11-dim surface]

DBA's Approach to Parameter Tuning
- DBAs run experiments; here, an experiment is a run of the DB workload with a specific parameter configuration
- Common strategy: vary one DB parameter at a time

Experiment-driven Management
[Diagram: mgmt. task -> plan next set of experiments -> conduct experiments on workbench -> process output to extract information -> are more experiments needed? If yes, loop; if no, report the result]
- Goal: automate this process

Roadmap
- Use cases of experiment-driven mgmt.: query tuning, benchmarking, Hadoop, testing, ...
- iTuned: a tool for DB conf parameter tuning; an end-to-end application of experiment-driven mgmt.
- .eX: a language and run-time
system that brings experiment-driven mgmt. to users & tuning tools

What is an Experiment?
- Depends on the management task
- Pay some extra cost, get new information in return
- Even for a specific management task, there can be a spectrum of possible experiments

Uses of Experiment-driven Mgmt.
- DB conf parameter tuning
- MapReduce job tuning in Hadoop
- Server benchmarking
- Capacity planning
- Cost/perf modeling
- Tuning "problem queries"
- Troubleshooting
- Testing
- Canary in the server farm (James Hamilton, Amazon)

Roadmap (next: iTuned, a tool for DB conf parameter tuning and an end-to-end application of experiment-driven mgmt.)

Problem Abstraction
- Unknown response surface: y = F(X), where X = parameters (x1, x2, ..., xm)
- Each experiment gives a sample (Xi, yi): set the DB to configuration Xi, run the workload that needs tuning, and measure the performance yi at Xi
- Goal: find a high-performance setting with a low total cost of running experiments

Example
- Goal: compute the potential utility of
candidate experiments

Where to do the next experiment?

iTuned's Adaptive Sampling Algorithm for Experiment Planning
- Phase I: Bootstrapping. Conduct some initial experiments.
- Phase II: Sequential Sampling. Loop until the stopping condition is reached:
  - Identify candidate experiments to do next
  - Based on the current samples, estimate the utility of each candidate experiment
  - Conduct the next experiment at the candidate with the highest utility
Utility of an Experiment
- Let (Xi, yi), i = 1..n, be the samples from the n experiments done so far
- Let X* be the best setting so far, i.e., y* = min_i yi (wlg assuming minimization)
- The utility U(X) of an experiment at X with outcome y = F(X) is:
  U(X) = y* - y if y < y*, and 0 otherwise
- However, U(X) poses a chicken-and-egg problem: y will be known only after the experiment is run at X
- Goal: compute the expected utility EU(X)

Expected Utility of an Experiment
- Suppose we have the probability density function of y (the performance at X): Prob(y = v | (Xi, yi), i = 1..n)
- Then:
  EU(X) = ∫_{v=-∞}^{+∞} U(X) Prob(y = v) dv = ∫_{v=-∞}^{y*} (y* - v) Prob(y = v) dv
- Goal: compute Prob(y = v | (Xi, yi), i = 1..n)

Model: Gaussian Process Representation (GRS) of a Response Surface
- GRS models the response surface as: y(X) = g(X) + Z(X) (+ ε(X) for measurement error)
- E.g., g(X) = x1 - 2 x2 + 0.1 x1^2, learned using common regression techniques
- Z: a Gaussian process that captures the regression residual

Primer on Gaussian Processes
- Univariate Gaussian distribution G = N(μ, σ^2): described by its mean μ and variance σ^2
- Multivariate Gaussian distribution (G1, G2, ..., Gn): described by a mean vector and a covariance matrix
- Gaussian process: generalizes the multivariate Gaussian to an arbitrary number of dimensions; described by mean and covariance functions
- If Z is a Gaussian process, then:
  - (Z(X1), ..., Z(Xn), Z(X)) is multivariate Gaussian
  - Z(X) | Z(X1), ..., Z(Xn) is a univariate Gaussian
  - Hence y(X) is a univariate Gaussian

Parameters of the GRS Model
- (Z(X1), ..., Z(Xn)) is multivariate Gaussian, and each Z(Xi) has zero mean
- Covariance(Z(Xi), Z(Xj)) ∝ exp(-Σ_k θk |xik - xjk|^γk): residuals at nearby points have higher correlation
- The θk and γk are learned from the samples (Xi, yi)

Use of the GRS Model
- Recall our goal: compute EU(X) = ∫_{v=-∞}^{y*} (y* - v) Prob(y = v) dv, which requires Prob(y = v | (Xi, yi), i = 1..n)
- Lemma: using the GRS, we can compute the mean μ(X) and variance σ^2(X) of the Gaussian y(X)
- Theorem: EU(X) has a closed form that is a product of a term that depends on (y* - μ(X)) and a term that depends on σ(X)
- It follows that settings X with high EU are either close to known good settings (for exploitation) or in highly uncertain regions (for exploration)

Example
- Settings X with high EU are either close to known good settings (high y* - μ(X)) or in highly uncertain regions (high σ(X))
[Figure: the unknown actual surface, the model estimate μ(X) with its uncertainty band, the test data, y*, and the resulting EU(X) curve]
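As a numerical illustration of the lemma and the theorem (a sketch, not iTuned's implementation: it assumes g(X) = 0 and a single θ and γ shared across dimensions, and writes the closed form as EU = σ (z Φ(z) + φ(z)) with z = (y* - μ)/σ, i.e., a product of σ and a term driven by y* - μ):

```python
import math
import numpy as np

def cov(xi, xj, theta=1.0, gamma=2.0):
    """Covariance(Z(Xi), Z(Xj)) = exp(-sum_k theta_k |xik - xjk|^gamma_k);
    one theta/gamma is shared across dimensions for brevity."""
    return math.exp(-theta * float(np.sum(np.abs(xi - xj) ** gamma)))

def gp_mean_var(X, y, x):
    """Lemma: mean mu(x) and std sigma(x) of the univariate Gaussian y(x),
    obtained by conditioning a zero-mean GP (g = 0) on the n samples."""
    K = np.array([[cov(a, b) for b in X] for a in X])
    k = np.array([cov(a, x) for a in X])
    K_inv = np.linalg.inv(K + 1e-9 * np.eye(len(X)))  # jitter for stability
    mu = float(k @ K_inv @ y)
    var = cov(x, x) - float(k @ K_inv @ k)
    return mu, math.sqrt(max(var, 0.0))

def expected_utility(mu, sigma, y_star):
    """Theorem: EU(X) = sigma * (z * Phi(z) + phi(z)), z = (y* - mu)/sigma."""
    if sigma == 0.0:
        return max(y_star - mu, 0.0)
    z = (y_star - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # its density
    return sigma * (z * Phi + phi)
```

At an already-sampled setting, σ(X) is near zero and EU(X) vanishes; far from all samples, σ(X) grows and EU(X) rises, which is exactly the exploration behavior the slide describes.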
Where to Conduct Experiments?
[Diagram: clients connect through a middle tier to the production platform (DBMS); Write-Ahead Log (WAL) shipping feeds a standby platform; a separate test platform is also available]

iTuned's Solution
- Exploit underutilized resources with minimal impact
on production workload
- The DBA/user designates resources where experiments can be run, e.g., production/standby/test
- The DBA/user specifies policies that dictate when experiments can be run
- Separate regular use ("home") from experiments ("garage")
- Example policy: if CPU, memory, & disk utilization have stayed below 10% for the past 15 mins, then the resource can be used for experiments

One Implementation of Home/Garage
[Diagram: the production platform (middle tier, data) ships its Write-Ahead Log to a standby machine; iTuned's interface, engine, and experiment planner & scheduler run experiments there on copy-on-write snapshots]
- Overheads are low

Empirical Evaluation (1)
- Cluster of machines with 2 GHz processors and 3 GB memory
- Two database systems: PostgreSQL & MySQL
- Various workloads:
  - OLAP: mixes of heavy-weight TPC-H queries, varying #queries, #query_types, and MPL; scale factors 1 and 10
  - OLTP: TPC-W and RUBiS
- Tuning of up to 30 configuration parameters
- Techniques compared:
  - Default parameter settings as shipped (D)
  - Manual rule-based tuning (M)
  - Smart Hill Climbing (S): a state-of-the-art technique
  - Brute-force search (B): run many experiments to find an approximation to the optimal setting
  - iTuned (I)
- Evaluation metrics: quality (workload running time after tuning) and efficiency (time needed for tuning)

Empirical Evaluation (2)
[Figure: comparison of tuning quality]

iTuned's Scalability Features (1)
- Identify important parameters quickly
- Run experiments in parallel
- Stop low-utility experiments early
- Compress the workload
- Work in progress: apply database-specific knowledge, incremental tuning, interactive tuning

iTuned's Scalability Features (2)
- Identify important parameters quickly, using sensitivity analysis with a few experiments
[Figure: #Parameters = 9, #Experiments = 10]

iTuned's Scalability Features (3)
[Figure]

Roadmap (next: .eX, a language and run-time system that brings experiment-driven mgmt. to users & tuning tools)

Back of the Envelope Calculation
- DBAs cost $300/day; consultants cost $100/hr
- 1 day of experiments gives a wealth of info (TPC-H, TPC-W, RUBiS workloads; 10-30 conf. params)
- Cost of running these experiments for 1 day on Amazon Web Services: server $10/day, storage $0.4/day, I/O $5/day; total ~$15/day
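The arithmetic behind the comparison (all dollar figures are the ones quoted on the slide; the slide rounds the total to ~$15):

```python
# Back-of-the-envelope: daily cost of experiments on AWS vs. a DBA's daily cost.
server, storage, io = 10.0, 0.4, 5.0
aws_per_day = server + storage + io   # 15.4, rounded to ~$15/day on the slide
dba_per_day = 300.0
print(f"AWS: ${aws_per_day}/day, roughly "
      f"{dba_per_day / aws_per_day:.0f}x cheaper than a ${dba_per_day:.0f}/day DBA")
```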
.eX: Power of Experiments to the People
- Users & tools express their needs as scripts in eXL (eXperiment Language)
- The .eX engine plans and conducts experiments on designated resources
- Intuitive visualization of results
[Diagram: an eXL script goes to the language processor, and .eX runs experiments on the designated resources]

Current Focus of .eX
- Parts of an eXL script:
  - Query: (approx.) response-surface mapping, search
  - Experiment setup & monitoring
  - Constraints & optimization: resources, cost, time
- Automatically generate the experiment-driven workflow

Summary
- Automated experiment-driven management: the time has come; the need, infrastructure, & promise are all there
- We have built many tools around this paradigm: http://www.cs.duke.edu/~shivnath/dotex.html
- It poses interesting questions and challenges: make it easy for users/admins to do experiments, and make experiments first-class citizens in systems