LINQ
Slides from Gang Luo, Xuting Zhao and Damien Guard

What is LINQ?
- Language Integrated Query
- Make query a part of the language
- Component of .NET Framework 3.5

Query without LINQ
- Objects using loops and conditions
    foreach (Customer c in customers)
        if (c.Region == "UK") ...
- Databases using SQL
    SELECT * FROM Customers WHERE Region='UK'
- XML using XPath/XQuery
    //Customers/Customer[@Region='UK']

ADO without LINQ
    SqlConnection con = new SqlConnection(...);
    con.Open();
    SqlCommand cmd = new SqlCommand(
        @"SELECT * FROM Customers
          WHERE Region = @Region", con);
    cmd.Parameters.AddWithValue("@Region", "UK");
    SqlDataReader dr = cmd.ExecuteReader();
    while (dr.Read()) {
        string name = dr.GetString(dr.GetOrdinal("Name"));
        string phone = dr.GetString(dr.GetOrdinal("Phone"));
        DateTime date = dr.GetDateTime(3);
    }
    dr.Close();
    con.Close();

Query with LINQ
C#
    var myCustomers = from c in customers
                      where c.Region == "UK"
                      select c;

More LINQ queries
    var goodCusts = (from c in db.Customers
                     where c.PostCode.StartsWith("GY")
                     orderby c.Sales descending
                     select c).Skip(10).Take(10);

Local variable type inference
- Compiler can infer types from initializer assignments
- The var keyword indicates a compiler-inferred type
  - Still strongly typed
  - This is not like JavaScript var
    var a = 10; // Simple types
    var x = new { Blog = "attardi", Created = DateTime.Now }; // Anonymous types
- Essential when using anonymous types

Anonymous Types
- An object of a new type is generated on the fly, without first defining the type.
- Useful for projection, to select one or more fields of another structure.
- The compiler generates the type with read-only properties (getters) for the corresponding members.
- Some common methods are also provided (Equals, GetHashCode, ToString).
- No other methods will be added to this type. But that is already enough!
- The object is created and initialized by an anonymous object initializer.
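As a minimal sketch of the two features above, the following program uses var with a simple initializer and projects into an anonymous type. The Customer class, its sample data, and the names AnonymousTypeDemo and UkNames are made up for illustration and do not come from the slides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, used only for this illustration.
public class Customer
{
    public string Name { get; set; }
    public string Region { get; set; }
    public string Phone { get; set; }
}

public static class AnonymousTypeDemo
{
    // Projects UK customers into an anonymous type and returns their names.
    public static List<string> UkNames(IEnumerable<Customer> customers)
    {
        // 'var' is required here: the projected type has no name we could write.
        var contacts = from c in customers
                       where c.Region == "UK"
                       select new { c.Name, c.Phone };

        var names = new List<string>();
        foreach (var contact in contacts)
        {
            // contact.Name is a compile-time checked, read-only property.
            names.Add(contact.Name);
        }
        return names;
    }

    public static void Main()
    {
        var a = 10; // inferred as int; assigning a string to 'a' would not compile

        var customers = new List<Customer>
        {
            new Customer { Name = "Ann", Region = "UK", Phone = "555-0100" },
            new Customer { Name = "Bob", Region = "US", Phone = "555-0101" }
        };

        foreach (var name in UkNames(customers))
            Console.WriteLine(name);        // prints "Ann"
        Console.WriteLine(a.GetType());     // prints "System.Int32"
    }
}
```

Because the anonymous type cannot be named, it cannot appear in a method signature, which is why UkNames returns plain strings rather than the projected objects.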
Advantages
- Unified data access
  - Single syntax to learn and remember
- Strongly typed
  - Catch errors during compilation
- IntelliSense
  - Prompt for syntax and attributes
- Bindable result sets

Architecture
[diagram]

LINQ to Objects
    int[] nums = new int[] {0,4,2,6,3,8,3,1};
    double average = nums.Take(6).Average();
    var above = from n in nums
                where n > average
                select n;

LINQ to Objects
- Query any IEnumerable<T> source
  - Includes arrays, List<T>, Dictionary...
- Many useful operators available
  - Sum, Max, Min, Distinct, Intersect, Union
- Expose your own data with IEnumerable<T> or IQueryable<T>
- Create operators using extension methods

LINQ operators
- Aggregate: Aggregate, Average, Count, Max, Min, Sum
- Conversion: Cast, OfType, ToArray, ToDictionary, ToList, ToLookup, AsEnumerable (ToSequence in pre-release builds)
- Ordering: OrderBy, ThenBy, Descending, Reverse
- Partitioning: Skip, SkipWhile, Take, TakeWhile
- Sets: Concat, Distinct, Except, Intersect, Union
- and others …

Query Expression
- SQL-like:
    from s in names
    where s.Length == 5
    orderby s
    select s.ToUpper();
- OO-style:
    names.Where(s => s.Length == 5)
         .OrderBy(s => s)
         .Select(s => s.ToUpper());
- Where, OrderBy, and Select are operators. The arguments to these operators are lambda expressions.

Lambda Expressions
- Example:
    s => s.Length == 5
- An executable function
  - An anonymous function. Can be assigned to a delegate variable.
  - No need to indicate the types
  - Can be passed to methods as parameters.
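The equivalence between the two query styles, and the fact that a lambda can be stored in a delegate variable and passed as an argument, can be sketched as follows. The class name QuerySyntaxDemo, the method names, and the sample names array are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Query-expression (SQL-like) form of the slide's example.
    public static List<string> WithQuerySyntax(IEnumerable<string> names)
    {
        var query = from s in names
                    where s.Length == 5
                    orderby s
                    select s.ToUpper();
        return query.ToList();
    }

    // The same query written with operators and lambda expressions.
    public static List<string> WithMethodSyntax(IEnumerable<string> names)
    {
        // A lambda assigned to a delegate variable, then passed to Where.
        Func<string, bool> hasFiveLetters = s => s.Length == 5;

        return names.Where(hasFiveLetters)
                    .OrderBy(s => s)
                    .Select(s => s.ToUpper())
                    .ToList();
    }

    public static void Main()
    {
        string[] names = { "Burke", "Connor", "Frank", "Everett",
                           "Albert", "George", "Harris", "David" };

        // Both lines print: BURKE, DAVID, FRANK
        Console.WriteLine(string.Join(", ", WithQuerySyntax(names).ToArray()));
        Console.WriteLine(string.Join(", ", WithMethodSyntax(names).ToArray()));
    }
}
```

The compiler translates the query-expression form into the same chain of operator calls, so the choice between the two is largely one of readability.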
Expression Tree
- Efficient in-memory data representations of lambda expressions
- Changing the behaviors of the expressions
- Applying your own optimization

Function Types
- Func<int, bool> is a shorthand for
    public delegate bool Func(int a0);

    // Initialized with anonymous method
    Func<int, bool> even = delegate (int x) { return x % 2 == 0; };

    // Initialized with lambda expression
    Func<int, bool> even2 = x => x % 2 == 0;

Extension Methods
- Control comes not only from lambda expressions, but also from extension methods
    public static class Enumerable {
        public static IEnumerable<T> Where<T>(
            this IEnumerable<T> source,
            Func<T, bool> predicate) {
            foreach (T item in source)
                if (predicate(item))
                    yield return item;
        }
    }

LINQ Operations
Join
- When there is a relationship (e.g. a foreign key) between two tables, no explicit join operation is needed
  - Use dot notation to access the relationship properties and navigate all the matching objects.
    var q = from o in db.Orders
            where o.Customer.City == "London"
            select o;
- To join any two data sources on any attribute, you need an explicit join operation.
    var query = names.Join(people, n => n, p => p.Name, (n, p) => p);
  The shaping lambda expression (n, p) => p is applied to each matching pair.

LINQ Operations (cont.)
Group Join
- The shaping lambda expression is applied to the outer element and the set of all inner elements that match it.
- Shape the result at a set level
    var query = names.GroupJoin(people, n => n, p => p.Name,
        (n, matching) => new { Name = n, Count = matching.Count() });

LINQ Operations (cont.)
Select Many
- Each object in the result set may contain a collection or array
- SelectMany helps decompose the structure and flatten the result
    var query = names.SelectMany(n => people.Where(p => n == p.Name));
- All the elements can be traversed in one foreach loop.

Aggregation
- Standard aggregation operators: Min, Max, Sum, Average.
    int totalLength = names.Sum(s => s.Length);
- General purpose (generic) operator:
    static U Aggregate<T, U>(this IEnumerable<T> source,
        U seed, Func<U, T, U> func)

LINQ Deferred
- A LINQ data source can actually implement one of two interfaces:
  - IEnumerable<T>
  - IQueryable<T>
    public interface IQueryable<T> : IEnumerable<T>, IQueryable, IEnumerable {
        // In the released .NET 3.5 these two methods live on IQueryProvider (IQueryable.Provider).
        IQueryable<S> CreateQuery<S>(Expression exp);
        S Execute<S>(Expression exp);
    }
- Create a deferred query execution plan

IQueryable
- The IQueryable<T> interface defers the evaluation of the query.
- An expression tree represents all the deferred queries as a whole.
- Several operations can be "merged": only one SQL query is generated and sent to the database.
- Multi-level defer

Lambda Expressions Revisited
- Lambda expressions can represent either IL code or data
  - Expression<T> makes the difference
    Func<int, bool> lambdaIL = n => n % 2 == 0;
    Expression<Func<int, bool>> lambdaTree = n => n % 2 == 0;
- The compiler handles Expression<T> types differently
  - Emits code to generate an expression tree instead of the usual IL for a delegate

Expression Trees
- Expression trees are hierarchical trees of instructions that compose an expression
- Added value of expression trees
  - Actual creation of IL is deferred until execution of the query
  - Implementation of IL creation can vary
- Trees can even be remoted for parallel processing

Creating IL from Expression Trees
- Right before execution, the tree is compiled into IL
    Expression<Func<Posting,bool>> predicate =
        p => p.Posted < DateTime.Now.AddDays(-5);
    Func<Posting,bool> d = predicate.Compile();
- The implementation of IL generation differs considerably between LINQ flavors.
  - LINQ to SQL generates IL that runs SQL commands
  - LINQ to Objects builds IL that calls the Enumerable (formerly Sequence) extension methods

Nested defer
    var q = from c in db.Customers
            where c.City == "London"
            select new { c.ContactName, c.Phone };

    var q2 = from c in q.AsEnumerable()
             select new {
                 Name = DoNameProcessing(c.ContactName),
                 Phone = DoPhoneProcessing(c.Phone)
             };

What if you want the intermediate result?
    string lastName = "Simpson";
    var persons = from p in personList
                  where p.LastName == lastName
                  select p;
    lastName = "Flanders";
    foreach (Person p in persons)
        Console.WriteLine("{0} {1}", p.FirstName, p.LastName);

Deferred Execution
- Advantages
  - Performance!
  - Query dependency!
- Disadvantages
  - Divide one query into multiple ones
  - If you iterate over the result set 100 times, the query will be executed 100 times.
  - Users have to be very careful

LINQ to SQL Data Model
    [Table(Name="Customers")]
    public class Customer {
        [Column(IsPrimaryKey=true)]
        public string CustomerID;
        …
        private EntitySet<Order> _Orders;
        [Association(Storage="_Orders", OtherKey="CustomerID")]
        public EntitySet<Order> Orders {
            get { return _Orders; }
            set { _Orders.Assign(value); }
        }
    }
- LINQ to SQL helps connect to relational databases and manipulate relational data as objects in memory.
- It achieves this by translating the operations into SQL statements.
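The deferred-execution slides above can be tried without a database. The following sketch uses LINQ to Objects in place of the slides' personList, and shows both the late evaluation of the captured variable and the use of ToList() to keep an intermediate result. The class name DeferredExecutionDemo, the Run method, and the sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns two strings: the deferred query's result and the snapshot's result.
    public static string[] Run()
    {
        var lastNames = new List<string> { "Simpson", "Flanders", "Simpson" };
        string lastName = "Simpson";

        // Nothing runs yet: 'query' only describes the filter.
        IEnumerable<string> query = lastNames.Where(n => n == lastName);

        // ToList() executes the query now and stores the intermediate result.
        List<string> snapshot = query.ToList();

        // The lambda captured the variable itself, not its value at definition time.
        lastName = "Flanders";

        return new[]
        {
            string.Join(",", query.ToArray()),    // evaluated here, sees "Flanders"
            string.Join(",", snapshot.ToArray())  // evaluated earlier, saw "Simpson"
        };
    }

    public static void Main()
    {
        string[] results = Run();
        Console.WriteLine(results[0]); // prints "Flanders"
        Console.WriteLine(results[1]); // prints "Simpson,Simpson"
    }
}
```

Materializing with ToList() or ToArray() is also the usual way to avoid re-running a query each time the result is iterated.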
LINQ to SQL
- Object-relational mapping
  - Records become strongly-typed objects
- Data context is the controller mechanism
- Facilitates update, delete & insert
- Translates LINQ queries behind the scenes
- Type, parameter and injection safe

Database mapping
- Map tables & fields to classes & properties
- Generates partial classes with attributes
- Each record becomes an object
- Data context represents the database
- Utilize tables, views or stored procedures

Modifying objects
- Update
  - Set object properties
- Delete
  - context.Table.DeleteOnSubmit(object)
- Insert
  - context.Table.InsertOnSubmit(object)
- Commit changes back
  - context.SubmitChanges()
  - Transactional - all or nothing

Consistency
- Every object is tracked by LINQ from the moment it is loaded from the database.
- The tracking mechanism monitors changes to relationship properties. Once you modify one side of the relationship, LINQ modifies the other side to keep it consistent.
- When an object is deleted, it may still exist in memory, but it will not cause inconsistency.

Concurrency
- Optimistic concurrency
- Conflict checking when SubmitChanges() is called
- By default, the transaction aborts and an exception is thrown when a conflict is detected.
- The user can handle the conflict in the exception catch block.
- The user can set whether or not to detect a conflict when a given column gets updated.

Transaction/Update
- On update, first check whether a new object has been added (via the tracking mechanism); if so, an INSERT statement is generated.
  - What does Django do here?
- Modifications do not hit the database until the SubmitChanges() method is called
- All operations are translated into SQL statements
- All modifications are encapsulated in a transaction.

Transaction/Update (cont.)
- If an exception is thrown during the update, all the changes are rolled back
- One SubmitChanges() is actually one transaction. (pros and cons?)
- Users can also explicitly indicate a new transaction scope.
LINQ to XML Class Hierarchy
http://msdn.microsoft.com/en-us/library/bb308960.aspx

LINQ to XML
- LINQ to XML
    var query = from p in people
                where p.CanCode
                select new XElement("Person",
                    new XAttribute("Age", p.Age),
                    p.Name);
- XML to LINQ
    var x = new XElement("People",
                from p in people
                where p.CanCode
                select new XElement("Person",
                    new XAttribute("Age", p.Age),
                    p.Name));

Performance
- LINQ offers more control and efficiency in O/R mapping than NHibernate
  - LINQ: External Mapping or Attribute Mapping
  - NHibernate: External Mapping
- Because of mapping, LINQ is slower than database tools such as SqlDataReader or SqlDataAdapter
  - On large datasets, their performance becomes increasingly similar
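Returning to the LINQ to XML slide, the snippet there can be made runnable roughly as follows, with a second method that queries the generated tree. The Person class, the class name XmlDemo, the method names, and the sample data are assumptions for illustration. XElement and XAttribute come from System.Xml.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical type matching the properties used on the slide.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
    public bool CanCode { get; set; }
}

public static class XmlDemo
{
    // Builds <People><Person Age="...">Name</Person>...</People> from a query.
    public static XElement Build(IEnumerable<Person> people)
    {
        return new XElement("People",
            from p in people
            where p.CanCode
            select new XElement("Person",
                new XAttribute("Age", p.Age),
                p.Name));
    }

    // Queries the XML tree back with the same LINQ syntax.
    public static List<string> NamesOlderThan(XElement root, int age)
    {
        return (from e in root.Elements("Person")
                where (int)e.Attribute("Age") > age
                select e.Value).ToList();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Ada", Age = 36, CanCode = true },
            new Person { Name = "Bob", Age = 50, CanCode = false },
            new Person { Name = "Cy", Age = 22, CanCode = true }
        };

        XElement root = Build(people);
        Console.WriteLine(root);

        foreach (string name in NamesOlderThan(root, 30))
            Console.WriteLine(name); // prints "Ada"
    }
}
```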