I am trying to compare two DataTables using LINQ. The tables are very simple, just one column each, but they have about 44,000 rows. I use the following code, but when I step through it, execution reaches if (dr.Any()) and just sits there; neither the next line nor the catch block is ever reached:
public static DataTable GetTableDiff(DataTable dt1, DataTable dt2)
{
DataTable dtDiff = new DataTable();
try
{
var dr = from r in dt1.AsEnumerable() where !dt2.AsEnumerable().Any(r2 => r["FacilityID"].ToString().Trim().ToLower() == r2["FacilityID"].ToString().Trim().ToLower()) select r;
if (dr.Any())
dtDiff = dr.CopyToDataTable();
}
catch (Exception ex)
{
}
return dtDiff;
}
I set max request length in web.config to make sure that is not an issue, but no change:
<system.web>
<compilation debug="true" targetFramework="4.5" />
<httpRuntime targetFramework="4.5" maxRequestLength="1048576" />
I don't think 44,000 rows is too big, is it?
Use a group join, which is O(N1+N2), instead of an O(N1*N2) search (currently, for each row in dt1 you are scanning all rows in dt2):
var diff = from r1 in dt1.AsEnumerable()
join r2 in dt2.AsEnumerable()
on r1.Field<string>("FacilityID").Trim().ToLower()
equals r2.Field<string>("FacilityID").Trim().ToLower() into g
where !g.Any() // get only rows which do not have joined rows from dt2
select r1;
With a join, each key (facility ID) is also calculated only once per row.
Another option is to create a simple row comparer:
public class FacilityIdComparer : IEqualityComparer<DataRow>
{
public bool Equals(DataRow x, DataRow y) => GetFacilityID(x) == GetFacilityID(y);
public int GetHashCode(DataRow row) => GetFacilityID(row)?.GetHashCode() ?? 0;
private string GetFacilityID(DataRow row)
=> row.Field<string>("FacilityID")?.Trim().ToLower();
}
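Then getting the rows of dt1 that are missing from dt2 is a one-liner with the LINQ Except method (note that Except has set semantics, so rows with duplicate keys collapse to one):
var diff = dt1.AsEnumerable().Except(dt2.AsEnumerable(), new FacilityIdComparer());
The same comparer also works with Intersect if you need the rows the two tables have in common.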
I would use a different, more lightweight approach, because you only take rows from one table and want just those with a new FacilityID:
public static DataTable GetTableDiff(DataTable dtNew, DataTable dtOld)
{
DataTable dtDiff = dtNew.Clone(); // no data only columns and constraints
var oldFacilityIds = dtOld.AsEnumerable().Select(r => r.Field<string>("FacilityID").Trim());
var oldFacilityIDSet = new HashSet<string>(oldFacilityIds, StringComparer.CurrentCultureIgnoreCase);
var newRows = dtNew.AsEnumerable()
.Where(r => !oldFacilityIDSet.Contains(r.Field<string>("FacilityID").Trim()));
foreach (DataRow row in newRows)
dtDiff.ImportRow(row);
return dtDiff;
}
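As a rough illustration of the hash-set idea above, here is a minimal self-contained sketch written as top-level statements. The MakeTable helper and the sample IDs are made up for the demo, null IDs are treated as empty strings, and StringComparer.OrdinalIgnoreCase is swapped in for the culture-sensitive comparer used above; adjust both assumptions to your data.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper for this demo: builds a one-column table of facility IDs.
DataTable MakeTable(params string[] ids)
{
    var table = new DataTable();
    table.Columns.Add("FacilityID", typeof(string));
    foreach (var id in ids)
        table.Rows.Add(id);
    return table;
}

var dtNew = MakeTable("A1", " b2 ", "C3", "d4");
var dtOld = MakeTable("a1", "B2");

// Normalize each old key once and store it in a hash set,
// so every lookup below is O(1) instead of a scan over dtOld.
var oldIds = new HashSet<string>(
    dtOld.AsEnumerable().Select(r => (r.Field<string>("FacilityID") ?? "").Trim()),
    StringComparer.OrdinalIgnoreCase);

// Clone copies the schema only; matching rows are imported one by one.
DataTable dtDiff = dtNew.Clone();
foreach (DataRow row in dtNew.AsEnumerable()
             .Where(r => !oldIds.Contains((r.Field<string>("FacilityID") ?? "").Trim())))
{
    dtDiff.ImportRow(row);
}

var diffIds = dtDiff.AsEnumerable().Select(r => r.Field<string>("FacilityID")).ToList();
Console.WriteLine(string.Join(", ", diffIds)); // C3, d4
```

With 44,000 rows on each side this does roughly 88,000 key normalizations in total, rather than up to 44,000 * 44,000 comparisons.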
I tried to count the number of rows in a grid at runtime with this code:
FormRun caller;
FormDataSource fds;
QueryRun queryRun;
int64 rows;
fds = caller.dataSource();
query = fds.query();
queryRun = new QueryRun(query);
rows = SysQuery::countTotal(queryRun); //this returns -1587322268
rows = SysQuery::countLoops(queryRun); //this returns 54057
The last line of code is closest to what I need, because there are 54057 lines, but if I add filters it still returns 54057.
I want logic that gets the number of rows the grid has at the moment the method is called.
Your query has more than one datasource.
The best way to explain your observation is to look at the implementation of countTotal and countLoops.
public client server static Integer countTotal(QueryRun _queryRun)
{
container c = SysQuery::countPrim(_queryRun.pack(false));
return conpeek(c,1);
}
public client server static Integer countLoops(QueryRun _queryRun)
{
container c = SysQuery::countPrim(_queryRun.pack(false));
return conpeek(c,2);
}
private server static container countPrim(container _queryPack)
{
...
if (countQuery.dataSourceCount() == 1)
qbds.addSelectionField(fieldnum(Common,RecId),SelectionField::Count);
countQueryRun = new QueryRun(countQuery);
while (countQueryRun.next())
{
common = countQueryRun.get(countQuery.dataSourceNo(1).table());
counter += common.RecId;
loops++;
}
return [counter,loops];
}
If your query contains only one datasource, count(RecId) is added as a selection field:
countTotal returns the number of records.
countLoops returns 1.
This is pretty fast, as fast as the SQL allows.
If your query contains more than one datasource, count(RecId) is not added:
countTotal returns the sum of the RecIds (which makes no sense).
countLoops returns the number of records.
countLoops is also slow if there are many records, as they are counted one by one.
If you have two datasources and want a fast count, you are on your own:
fds = caller.dataSource();
queryRun = new QueryRun(fds.queryRun().query());
queryRun.query().dataSourceNo(2).joinMode(JoinMode::ExistsJoin);
queryRun.query().dataSourceNo(1).clearFields();
queryRun.query().dataSourceNo(1).addSelectionField(fieldnum(Common,RecId),SelectionField::Count);
queryRun.next();
rows = queryRun.getNo(1).RecId;
The reason your count did not respect the filters was because you used datasource.query() rather than datasource.queryRun().query(). The former is the static query, the latter is the dynamic query with user filters included.
Update, found some old code with a more general approach:
static int tableCount(QueryRun _qr)
{
QueryRun qr;
Query q = new Query(_qr.query());
int dsN = _qr.query().dataSourceCount();
int ds;
for (ds = 2; ds <= dsN; ++ds)
{
if (q.dataSourceNo(ds).joinMode() == JoinMode::OuterJoin)
q.dataSourceNo(ds).enabled(false);
else if (q.dataSourceNo(ds).joinMode() == JoinMode::InnerJoin)
{
q.dataSourceNo(ds).joinMode(JoinMode::ExistsJoin);
q.dataSourceNo(ds).fields().clearFieldList();
}
}
q.dataSourceNo(1).fields().clearFieldList();
q.dataSourceNo(1).addSelectionField(fieldNum(Common,RecId), SelectionField::Count);
qr = new QueryRun(q);
qr.next();
return any2int(qr.getNo(1).RecId);
}
I have a DataTable dtStudent with columns such as stuid, stuname, stuclass and so on, and assume it has 10 rows. We don't know in advance how many columns the DataTable will have; it varies depending on functionality.
How do I convert the DataTable to a list of strings (string, string, string, ...) without using a List of a class, since the columns are not known in advance?
Please help me.
You need to iterate through DataTable rows and collect data.
You can use DataTable.Columns for getting column names and using them to get row cells.
List<string[]> result = new List<string[]>();
foreach (DataRow row in rightsTable.Rows)
{
string[] cells = new string[rightsTable.Columns.Count];
for (int i = 0; i < rightsTable.Columns.Count; i++)
{
cells[i] = row[rightsTable.Columns[i].ColumnName].ToString();
}
result.Add(cells);
}
Update:
I had forgotten about the ItemArray property. It simply returns the array of objects stored in a row, so you just need to call ToString() on each element.
You can use this version instead, as it is more convenient and faster.
List<string[]> result = new List<string[]>();
foreach (DataRow row in dt.Rows)
{
result.Add(row.ItemArray.Select(x => x.ToString()).ToArray());
}
Or even shorter using LINQ:
List<string[]> result = dt.Rows
.Cast<DataRow>()
.Select(row => row.ItemArray
.Select(x => x.ToString())
.ToArray())
.ToList();
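To make the ItemArray approach concrete, here is a small self-contained sketch written as top-level statements. The two-column table and its values are invented for the demo; the header array is an optional extra that shows how the column names mentioned above can be collected alongside the rows.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Sample table with made-up columns; the real table may have any number of columns.
var dt = new DataTable();
dt.Columns.Add("stuid", typeof(int));
dt.Columns.Add("stuname", typeof(string));
dt.Rows.Add(1, "Ann");
dt.Rows.Add(2, DBNull.Value); // a missing name is stored as DBNull

// Column names in table order, useful as a header row.
string[] header = dt.Columns.Cast<DataColumn>().Select(c => c.ColumnName).ToArray();

// One string[] per row; DBNull.Value.ToString() yields an empty string.
List<string[]> rows = dt.AsEnumerable()
    .Select(r => r.ItemArray.Select(v => v?.ToString() ?? "").ToArray())
    .ToList();

Console.WriteLine(string.Join(" | ", header));
foreach (var cells in rows)
    Console.WriteLine(string.Join(" | ", cells));
```

Because nothing here refers to a specific column, the same code works unchanged when the number of columns varies.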
I have a CSV file like this:
I need to show this CSV file in a GridView, but I must change the format like this:
I must select only the distinct Date and Mount values and use the Date values as GridView columns.
How can I use values from the CSV file as GridView columns?
I assume that reading the CSV file is not the issue and that you already have something like a List<ClassName>, a DataTable or a List<string[]>. In the following approach I presume it's a List<string[]> where the first "column" is Date, the second Mount and the last %.
You need real DateTimes and ints to be able to sum percents by date:
var formatProvider = new CultureInfo("de-DE"); // seems to be the correct format
var mountGroups = listOfStringArray
.Select(arr => new
{
Date = DateTime.Parse(arr[0].Trim(), formatProvider).Date,
Mount = arr[1].Trim(),
Percent = int.Parse(arr[2].Trim())
})
.GroupBy(x => x.Mount);
Now you have grouped by Mount, you just need to sum the percents for every day. You can use a DataTable as datasource for the GridView. Here's code that creates the table with the dynamic columns for every day:
var dataSource = new DataTable();
dataSource.Columns.Add("Mount");
var lastWeekColumns = Enumerable.Range(0, 7)
.Select(d => new DataColumn(DateTime.Today.AddDays(-6 + d).ToString("d", formatProvider), typeof(int))) // same culture as the DateTime.Parse call that reads the column name back
.ToArray();
dataSource.Columns.AddRange(lastWeekColumns);
The following loop executes the LINQ query and fills the table:
foreach(var grp in mountGroups)
{
DataRow row = dataSource.Rows.Add();
row.SetField("Mount", grp.Key); // because: GroupBy(x => x.Mount);
foreach(DataColumn dayCol in lastWeekColumns)
{
DateTime day = DateTime.Parse(dayCol.ColumnName, formatProvider);
int sumPercent = grp.Where(x => x.Date == day)
.Select(x => x.Percent)
.DefaultIfEmpty(0) // fallback value for missing days
.Sum();
row.SetField(dayCol, sumPercent);
}
}
Now you just need to use it as the data source (with AutoGenerateColumns set to true):
grid.DataSource = dataSource;
grid.DataBind();
I have a GridView that I want to sort. I wrote the following method for it:
private void SortGridView(string sortExpression, string direction)
{
var constr = new AdminRequirementEF();
string sort = string.Concat("it.", sortExpression, " ", direction);
int pageSize = Convert.ToInt32(ddPageSize.SelectedItem.Text);
var results = constr.Projects;
int totalRecords = results.Count();
this.PopulatePager(totalRecords, pageIndex);
var sortedResults = constr.Projects.OrderBy(sort).Skip((pageIndex - 1) * pageSize).Take(pageNum).ToList();
grdMain.DataSource = sortedResults;
grdMain.DataBind();
}
The problem is that sorting is applied to all records, not to the records filtered for the current page. I want to call OrderBy(sort) after applying Skip and Take, but that gives me an error stating that Skip cannot be applied before an OrderBy clause. Any help will be much appreciated.
You can first order the constr.Projects collection by its primary key:
var results = constr.Projects.OrderBy(p => p.ProjectId);
and then apply Skip and Take to 'results', followed by the sort you actually want:
var pageResults = results.Skip((pageIndex - 1) * pageSize).Take(pageNum).OrderBy(sort).ToList();
This way you get the records for that particular page, sorted as required.
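The difference between the two orderings is easier to see on in-memory data. The following is only an illustrative sketch using LINQ to Objects with invented project names; it does not touch Entity Framework, where a stable ordering (such as the primary key) is still required before Skip.

```csharp
using System;
using System.Linq;

// Made-up data standing in for the Projects table, in "primary key" order.
var projects = new[] { "delta", "alpha", "echo", "charlie", "bravo", "foxtrot" };
int pageIndex = 1, pageSize = 3;

// Sort everything, then page: page 1 holds the three smallest names overall.
var sortThenPage = projects
    .OrderBy(p => p)
    .Skip((pageIndex - 1) * pageSize).Take(pageSize)
    .ToList();

// Page in the stable base order first, then sort only the rows on that page.
var pageThenSort = projects
    .Skip((pageIndex - 1) * pageSize).Take(pageSize)
    .OrderBy(p => p)
    .ToList();

Console.WriteLine(string.Join(", ", sortThenPage)); // alpha, bravo, charlie
Console.WriteLine(string.Join(", ", pageThenSort)); // alpha, delta, echo
```

The second variant keeps page membership fixed regardless of the sort column, which is the behavior the question asks for.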
I have a LINQ list that contains data from more than one table, and these tables are related to each other. Each item in the list has a property holding the related table's data: for example, table1 and table2 are related, and each table1 item in the list automatically carries its table2 data.
I want to convert this list into XML, but that throws a circular reference error. So instead I want to convert the list into a DataSet and generate the XML from that DataSet.
Can anyone provide code to generate a multi-table DataSet from a list,
or
code to convert a list to XML,
or
any other helpful comments?
You really haven't given a lot of information to work with, but I'll take a stab at a basic answer.
Given this class structure:
public class Table1
{
public string Value;
public Table2 Table2;
}
public class Table2
{
public string Value;
}
I can create this list:
var lsttable1 = new List<Table1>()
{
new Table1()
{
Value = "Foo1",
Table2 = new Table2()
{
Value = "Bar1",
},
},
new Table1()
{
Value = "Foo2",
Table2 = new Table2()
{
Value = "Bar2",
},
},
};
Now, I might want to convert this into XML that looks like this:
<Table1s>
<Table1 Value="Foo1">
<Table2 Value="Bar1" />
</Table1>
<Table1 Value="Foo2">
<Table2 Value="Bar2" />
</Table1>
</Table1s>
Here's the LINQ code that does it:
var xd =
new XDocument(
new XElement(
"Table1s",
from t1 in lsttable1
select new XElement(
"Table1",
new XAttribute("Value", t1.Value),
new XElement(
"Table2",
new XAttribute("Value", t1.Table2.Value)
)
)
)
);
Is that the kind of thing that you wanted?