SUMMARY
This step-by-step article describes how to implement and
how to use a
DataSetHelper class that includes sample code to create a
DataTable object that contains aggregate data from another
DataTable object.
To do this, use the following public
methods:
- CreateGroupByTable
- InsertGroupByInto
- SelectGroupByInto
The
SelectGroupByInto method calls the
CreateGroupByTable and the
InsertGroupByInto methods. You can also use private helper methods and data members
to store the parsed field list, and to handle NULL values correctly when the
aggregate values are calculated.
The
DataSetHelper class includes a
DataSet member variable. Optionally, you can assign an existing
DataSet. If the member variable points to a valid
DataSet, any
DataTable objects that the
CreateGroupByTable or the
SelectGroupByInto method creates are added to the
DataSet. In either case, the method call returns a reference to the
DataTable object. The
InsertGroupByInto method requires an existing target
DataTable, and does not return anything.
For additional information about DataSet objects, click the following article number to view the article in the Microsoft Knowledge Base:
313485
INFO: Roadmap for ADO.NET DataSet, DataView, and DataViewManager
back to
the topRequirements
The following list outlines the recommended hardware, software,
network infrastructure, and service packs that you need:
- Microsoft Windows XP, Windows 2000, or Windows NT 4.0
Service Pack 6a
- Microsoft Visual Studio .NET
This article assumes that you are familiar with the following
topics:
- Visual C# .NET syntax
- ADO.NET fundamentals and syntax
back to
the topDataSetHelper Shell Class
The code in this section declares the shell class that all
DataSetHelper articles add methods and member variables to.
- Start Visual Studio .NET.
- On the File menu, point to New, and then click Project.
- In the New Project dialog box, click Visual C# Projects under Project Types, and then click Class Library under Templates.
- In the Name text box, type DataSetHelper. Click
OK.
- In the code, at the top of the code window, add the
following line:
using System.Data;
- Replace the class code with the following code:
public class DataSetHelper
{
public DataSet ds;
public DataSetHelper(ref DataSet DataSet)
{
ds = DataSet;
}
public DataSetHelper()
{
ds = null;
}
}
The two overloads for the constructor allow you to create an instance
of the class with or without a reference to a valid DataSet. For a class that contains a reference to a valid DataSet, the DataTable objects that the methods return are also added automatically to the DataSet.
back to
the topField List Parsers
This section contains the code for two field list parsers: one for the field list, and one for the fields to group by. The parsed
structure is used so that the
CreateGroupByTable and
InsertGroupByInto methods do not have to reparse the field list. These methods must reparse the field list if they are called from the
SelectGroupByInto method or from your own code. The parsed field list and the unparsed field list
are stored in
Private member variables of the
DataSetHelper class.
- Add the following Private class definition in the DataSetHelper class that you created in the "DataSetHelper Shell Class" section:
private class FieldInfo
{
public string RelationName;
public string FieldName; //source table field name
public string FieldAlias; //destination table field name
public string Aggregate;
}
Note This class is common to other DataSetHelper articles, and contains some fields that this article does not use.
- Add the following Private member variables to the class definition immediately after the DataSet declaration:
private System.Collections.ArrayList m_FieldInfo; private string m_FieldList;
private System.Collections.ArrayList GroupByFieldInfo; private string GroupByFieldList;
- Add the following Private method to the DataSetHelper class definition. This method is the same as the method that is used in other DataSetHelper articles and is used to parse the list of fields to group by.
private void ParseFieldList(string FieldList, bool AllowRelation)
{
/*
* This code parses FieldList into FieldInfo objects and then
* adds them to the m_FieldInfo private member
*
* FieldList systax: [relationname.]fieldname[ alias], ...
*/
if (m_FieldList == FieldList) return;
m_FieldInfo = new System.Collections.ArrayList();
m_FieldList = FieldList;
FieldInfo Field; string[] FieldParts; string[] Fields=FieldList.Split(',');
int i;
for (i=0; i<=Fields.Length-1; i++)
{
Field=new FieldInfo();
//parse FieldAlias
FieldParts = Fields[i].Trim().Split(' ');
switch (FieldParts.Length)
{
case 1:
//to be set at the end of the loop
break;
case 2:
Field.FieldAlias=FieldParts[1];
break;
default:
throw new Exception("Too many spaces in field definition: '" + Fields[i] + "'.");
}
//parse FieldName and RelationName
FieldParts = FieldParts[0].Split('.');
switch (FieldParts.Length)
{
case 1:
Field.FieldName=FieldParts[0];
break;
case 2:
if (AllowRelation==false)
throw new Exception("Relation specifiers not permitted in field list: '" + Fields[i] + "'.");
Field.RelationName = FieldParts[0].Trim();
Field.FieldName=FieldParts[1].Trim();
break;
default:
throw new Exception("Invalid field definition: " + Fields[i] + "'.");
}
if (Field.FieldAlias==null)
Field.FieldAlias = Field.FieldName;
m_FieldInfo.Add (Field);
}
}
- Add the following Private method to the class definition. This method is used to parse the field
list, including aggregate functions.
private void ParseGroupByFieldList(string FieldList)
{
/*
* Parses FieldList into FieldInfo objects and adds them to the GroupByFieldInfo private member
*
* FieldList syntax: fieldname[ alias]|operatorname(fieldname)[ alias],...
*
* Supported Operators: count,sum,max,min,first,last
*/
if (GroupByFieldList == FieldList) return;
GroupByFieldInfo = new System.Collections.ArrayList();
FieldInfo Field; string[] FieldParts; string[] Fields = FieldList.Split(',');
for (int i=0; i<=Fields.Length-1;i++)
{
Field = new FieldInfo();
//Parse FieldAlias
FieldParts = Fields[i].Trim().Split(' ');
switch (FieldParts.Length)
{
case 1:
//to be set at the end of the loop
break;
case 2:
Field.FieldAlias = FieldParts[1];
break;
default:
throw new ArgumentException("Too many spaces in field definition: '" + Fields[i] + "'.");
}
//Parse FieldName and Aggregate
FieldParts = FieldParts[0].Split('(');
switch (FieldParts.Length)
{
case 1:
Field.FieldName = FieldParts[0];
break;
case 2:
Field.Aggregate = FieldParts[0].Trim().ToLower(); //we're doing a case-sensitive comparison later
Field.FieldName = FieldParts[1].Trim(' ', ')');
break;
default:
throw new ArgumentException("Invalid field definition: '" + Fields[i] + "'.");
}
if (Field.FieldAlias==null)
{
if (Field.Aggregate==null)
Field.FieldAlias=Field.FieldName;
else
Field.FieldAlias = Field.Aggregate + "of" + Field.FieldName;
}
GroupByFieldInfo.Add(Field);
}
GroupByFieldList = FieldList;
}
back to
the topCreateGroupByTable Method
This section contains the code for the
CreateGroupByTable method.
The following is the calling convention for the
CreateGroupByTable method:
dt = dsHelper.CreateGroupByTable("OrderSummary", ds.Tables["Orders"], "EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max");
This call sample creates a new
DataTable with a
TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). The four fields have the same data type as the EmployeeID and the Amount
fields in the Orders table.
Use the following syntax to specify
fields in the field list:
fieldname[ alias]|aggregatefunction(fieldname)[ alias], ...
Note the following for this syntax:
- The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable.
- You can rename a field in the destination DataTable by specifying an alias name.
- The field list can contain a subset of field names that are
listed in a different order than in the source DataTable. If the field list is blank, an exception is thrown.
- Relation specifiers are not supported as part of the field
name. All fields must come from the same DataTable.
To call the
CreateGroupByTable method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
public DataTable CreateGroupByTable(string TableName, DataTable SourceTable, string FieldList)
{
/*
* Creates a table based on aggregates of fields of another table
*
* RowFilter affects rows before GroupBy operation. No "Having" support
* though this can be emulated by subsequent filtering of the table that results
*
* FieldList syntax: fieldname[ alias]|aggregatefunction(fieldname)[ alias], ...
*/
if (FieldList == null)
{
throw new ArgumentException("You must specify at least one field in the field list.");
//return CreateTable(TableName, SourceTable);
}
else
{
DataTable dt = new DataTable(TableName);
ParseGroupByFieldList(FieldList);
foreach (FieldInfo Field in GroupByFieldInfo)
{
DataColumn dc = SourceTable.Columns[Field.FieldName];
if (Field.Aggregate==null)
dt.Columns.Add(Field.FieldAlias, dc.DataType, dc.Expression);
else
dt.Columns.Add(Field.FieldAlias, dc.DataType);
}
if (ds != null)
ds.Tables.Add(dt);
return dt;
}
}
back to
the topInsertGroupByInto Method
This section contains code for the
InsertGroupByInto method.
The results are sorted on the fields that are listed in the
GroupBy argument. The
GroupBy argument must comply with a valid Sort field list (minus
ASC and DESC modifiers). If the
GroupBy argument is blank, the target
DataTable contains only a single record that aggregates all the input.
When you call the
ParseGroupByFieldList and the
ParseFieldList properties, you can parse lists that were previously parsed, if
these lists are available. If the field list is blank, an exception is thrown.
This is the calling convention for the
InsertGroupByInto method:
dsHelper.InsertGroupByInto(ds.Tables["OrderSummary"], ds.Tables["Orders"],
"EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max", "EmployeeID<5", "EmployeeID");
This call sample reads records from the
DataTable that is named Orders and writes records to the
DataTable that is named OrderSummary. The OrderSummary
DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the resultant
DataTable.
To call the
InsertGroupByInto method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
public void InsertGroupByInto(DataTable DestTable, DataTable SourceTable, string FieldList,
string RowFilter, string GroupBy)
{
/*
* Copies the selected rows and columns from SourceTable and inserts them into DestTable
* FieldList has same format as CreateGroupByTable
*/
if (FieldList == null)
throw new ArgumentException("You must specify at least one field in the field list.");
ParseGroupByFieldList(FieldList); //parse field list
ParseFieldList(GroupBy,false); //parse field names to Group By into an arraylist
DataRow[] Rows = SourceTable.Select(RowFilter, GroupBy);
DataRow LastSourceRow = null, DestRow = null; bool SameRow; int RowCount=0;
foreach(DataRow SourceRow in Rows)
{
SameRow=false;
if (LastSourceRow!=null)
{
SameRow=true;
foreach(FieldInfo Field in m_FieldInfo)
{
if (!ColumnEqual(LastSourceRow[Field.FieldName], SourceRow[Field.FieldName]))
{
SameRow=false;
break;
}
}
if (!SameRow)
DestTable.Rows.Add(DestRow);
}
if (!SameRow)
{
DestRow = DestTable.NewRow();
RowCount=0;
}
RowCount+=1;
foreach(FieldInfo Field in GroupByFieldInfo)
{
switch(Field.Aggregate) //this test is case-sensitive
{
case null: //implicit last
case "": //implicit last
case "last":
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
break;
case "first":
if (RowCount==1)
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
break;
case "count":
DestRow[Field.FieldAlias]=RowCount;
break;
case "sum":
DestRow[Field.FieldAlias]=Add(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
case "max":
DestRow[Field.FieldAlias]=Max(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
case "min":
if (RowCount==1)
DestRow[Field.FieldAlias]=SourceRow[Field.FieldName];
else
DestRow[Field.FieldAlias]=Min(DestRow[Field.FieldAlias], SourceRow[Field.FieldName]);
break;
}
}
LastSourceRow = SourceRow;
}
if(DestRow!=null)
DestTable.Rows.Add(DestRow);
}
private FieldInfo LocateFieldInfoByName(System.Collections.ArrayList FieldList, string Name)
{
//Looks up a FieldInfo record based on FieldName
foreach (FieldInfo Field in FieldList)
{
if (Field.FieldName==Name)
return Field;
}
return null;
}
private bool ColumnEqual(object a, object b)
{
/*
* Compares two values to see if they are equal. Also compares DBNULL.Value.
*
* Note: If your DataTable contains object fields, you must extend this
* function to handle them in a meaningful way if you intend to group on them.
*/
if ((a is DBNull) && (b is DBNull))
return true; //both are null
if ((a is DBNull) || (b is DBNull))
return false; //only one is null
return (a==b); //value type standard comparison
}
private object Min(object a, object b)
{
//Returns MIN of two values - DBNull is less than all others
if ((a is DBNull) || (b is DBNull))
return DBNull.Value;
if (((IComparable)a).CompareTo(b)==-1)
return a;
else
return b;
}
private object Max(object a, object b)
{
//Returns Max of two values - DBNull is less than all others
if (a is DBNull)
return b;
if (b is DBNull)
return a;
if (((IComparable)a).CompareTo(b)==1)
return a;
else
return b;
}
private object Add(object a, object b)
{
//Adds two values - if one is DBNull, then returns the other
if (a is DBNull)
return b;
if (b is DBNull)
return a;
return ((decimal)a + (decimal)b);
}
back to
the topSelectGroupByInto Method
This section contains the code for the
SelectGroupByInto method. This method is a combination of the
CreateGroupByTable and the
InsertGroupByInto methods. The
SelectGroupByInto method creates a new
DataTable based on existing
DataTable objects, and copies the records that are sorted and filtered to
the new
DataTable.
The following is the calling convention for the
SelectGroupByInto method:
dt = dsHelper.SelectGroupByInto("OrderSummary", ds.Tables["Employees"],
"EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max", "EmployeeID<5", "EmployeeID");
This call sample creates a new
DataTable with a
TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). These four fields have the same data type as the EmployeeID and the
Amount fields in the Orders table. Then this sample reads records from the
Orders
DataTable, and writes records to the OrderSummary
DataTable. The OrderSummary
DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID. If the
GroupBy argument is blank, the target
DataTable contains only a single record that aggregates all the
input.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the
DataTable that results.
To call the
SelectGroupByInto method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
public DataTable SelectGroupByInto(string TableName, DataTable SourceTable, string FieldList,
string RowFilter, string GroupBy)
{
/*
* Selects data from one DataTable to another and performs various aggregate functions
* along the way. See InsertGroupByInto and ParseGroupByFieldList for supported aggregate functions.
*/
DataTable dt = CreateGroupByTable(TableName, SourceTable, FieldList);
InsertGroupByInto(dt, SourceTable, FieldList, RowFilter, GroupBy);
return dt;
}
back to
the topTest the Application
- Save, and then compile the DataSetHelper class that you created in the previous sections.
- Follow these steps to create a new Visual C# Windows
Application:
- Start Visual Studio .NET.
- On the File menu, point to New, and then click Project.
- In the New Project dialog box, click Visual C# Projects under Project Types, and then click Windows Application under Templates.
- In Solution Explorer, right-click the solution, and
then click Add Existing Project. Add the DataSetHelper project.
- On the Project menu, click Add Reference.
- In the Add Reference dialog box, click the Projects tab, and then add a reference to the DataSetHelper project to the
Windows Form application.
- In the form designer, drag three Button controls and a DataGrid control from the toolbox to the form. Name the buttons
btnCreateGroupBy,
btnInsertGroupByInto, and
btnSelectGroupByInto. Keep the default name for the DataGrid control (DataGrid1).
- In the form code, add the following using statement to the top of the code window:
using System.Data;
- Add the following variable declarations to the form
definition:
DataSet ds; DataSetHelper.DataSetHelper dsHelper;
- Add the following code to the Form_Load event:
ds = new DataSet();
dsHelper = new DataSetHelper.DataSetHelper(ref ds);
//Create the source table
DataTable dt = new DataTable("Orders");
dt.Columns.Add("EmployeeID", Type.GetType("System.String"));
dt.Columns.Add("OrderID", Type.GetType("System.Int32"));
dt.Columns.Add("Amount", Type.GetType("System.Decimal"));
dt.Rows.Add(new object[] {"Sam", 5, 25.00});
dt.Rows.Add(new object[] {"Tom", 7, 50.00});
dt.Rows.Add(new object[] {"Sue", 9, 11.00});
dt.Rows.Add(new object[] {"Tom", 12, 7.00});
dt.Rows.Add(new object[] {"Sam", 14, 512.00});
dt.Rows.Add(new object[] {"Sue", 15, 17.00});
dt.Rows.Add(new object[] {"Sue", 22, 2.50});
dt.Rows.Add(new object[] {"Tom", 24, 3.00});
dt.Rows.Add(new object[] {"Tom", 33, 78.75});
ds.Tables.Add(dt);
- Add the following code to the btnCreateGroupBy.Click event:
dsHelper.CreateGroupByTable("OrderSummary", ds.Tables["Orders"],
"EmployeeID,count(EmployeeID) Orders,Sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder");
dataGrid1.SetDataBinding(ds, "OrderSummary");
- Add the following code to the btnInsertGroupByInto.Click event:
dsHelper.InsertGroupByInto(ds.Tables["OrderSummary"], ds.Tables["Orders"],
"EmployeeID,count(EmployeeID) Orders,sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder",
"", "EmployeeID");
dataGrid1.SetDataBinding(ds, "OrderSummary");
- Add the following code to the btnSelectGroupByInto.Click event:
dsHelper.SelectGroupByInto("OrderSummary2", ds.Tables["Orders"],
"EmployeeID,count(EmployeeID) Orders,sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder",
"OrderID>10", "EmployeeID");
dataGrid1.SetDataBinding(ds, "OrderSummary2");
- Run the application, and then click each of the buttons. Notice that the DataGrid is populated with the tables and data from the code.
Note You can only click the btnCreateGroupBy and the btnSelectGroupByInto buttons one time. If you click either of these buttons more than
one time, you receive an error message that you are trying to add the same
table two times. Additionally, you must click btnCreateGroupBy before you click btnInsertGroupByInto; otherwise, the destination DataTable is not created. If you click the btnInsertGroupByInto button multiple times, you populate the DataGrid with duplicate records.
back to
the topEnhancement Ideas
- The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable. You can extend the CreateTable method to copy additional properties, such as the MaxLength property, or you can create new key columns.
- You can expand the functionality so that you can specify
fields from related DataTable objects.
- You can expand the functionality by adding support for
additional aggregate functions, such as the AVG or the STDEV function. If you do this, you must have dynamically created
variables to track intermediate values. To track intermediate values, you can
add member variables to the FieldInfo class, and then use those member variables to store intermediate
values.
back to
the topTroubleshoot
- The fieldname and the alias parts of the field list must
comply with DataColumn naming conventions. The parser also restricts the names, in that
the name must not contain a period (.), a comma (,), or a space ( ).
- If you click a button more than one time, the same table is
added two times to the DataSet. This results in an exception. To work around this problem, you
can add code to the test application to check whether a DataTable of the same name already exists. Alternatively, you can create
the DataSetHelper class without a reference to a DataSet, and then bind the DataGrid.DataSource property directly to the dt variable instead of by using the SetDataBinding method call.
- If the source table uses custom data types (for example, a
class), you must add code to the InsertGroupByInto method to perform a deep copy of the data. Otherwise, only a
reference is copied.
back to
the top