How to group all duplicate objects into one list and all unique objects into another list from an original list in C#? - asp.net

I have a text file to read from, and I convert each line to an object with an Id and someText. I would like to group the objects so that I have two lists: a unique list and a duplicate list. The data is very big, up to hundreds of thousands of lines. Which is the best data structure to use? Please provide some sample code in C#. Thanks a lot!
For example:
original list read from text file:
{(1, someText),(2, someText),(3, someText),(3, someText1),(4, someText)}
unique list:
{(1, someText),(2, someText),(4, someText)}
duplicate list:
{(3, someText),(3, someText1)}

Here's an example with LINQ:
Random rnd = new Random();
int cnt = 0; // This will "generate our ids".
List<KeyValuePair<int, string>> values = new List<KeyValuePair<int, string>>();
// The using block disposes the reader when reading is done.
using (StreamReader sr = new StreamReader("enterYourPathHere"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Convert the line to your object (using KeyValuePair for testing).
        values.Add(new KeyValuePair<int, string>(cnt, line));
        // Increment the id with a 50% chance. Next(0, 2) returns 0 or 1,
        // whereas Next(0, 1) always returns 0.
        if (rnd.Next(0, 2) == 1) cnt++;
    }
}
// Group once by id, then split on the size of each group.
var groups = values.GroupBy(x => x.Key).ToList();
var unique = groups.Where(g => g.Count() == 1).SelectMany(g => g).ToList();
var duplicates = groups.Where(g => g.Count() > 1).SelectMany(g => g).ToList();
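For the "which data structure" part of the question, one option is a Dictionary keyed by Id, filled in a single pass over the lines. The following is only a sketch under assumptions: the class and method names (DuplicateSplitter, Split) are made up for illustration, and KeyValuePair<int, string> stands in for the real object type.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateSplitter
{
    // Puts items whose Id occurs exactly once into 'unique'
    // and items whose Id occurs more than once into 'duplicates'.
    public static void Split(
        IEnumerable<KeyValuePair<int, string>> items,
        out List<KeyValuePair<int, string>> unique,
        out List<KeyValuePair<int, string>> duplicates)
    {
        // Bucket the items by Id; dictionary lookups are O(1) on average.
        var buckets = new Dictionary<int, List<KeyValuePair<int, string>>>();
        foreach (var item in items)
        {
            List<KeyValuePair<int, string>> bucket;
            if (!buckets.TryGetValue(item.Key, out bucket))
            {
                bucket = new List<KeyValuePair<int, string>>();
                buckets.Add(item.Key, bucket);
            }
            bucket.Add(item);
        }

        unique = new List<KeyValuePair<int, string>>();
        duplicates = new List<KeyValuePair<int, string>>();
        // Note: the enumeration order of Dictionary.Values is not guaranteed.
        foreach (var bucket in buckets.Values)
        {
            if (bucket.Count == 1) unique.AddRange(bucket);
            else duplicates.AddRange(bucket);
        }
    }

    public static void Main()
    {
        // The sample data from the question.
        var items = new List<KeyValuePair<int, string>>
        {
            new KeyValuePair<int, string>(1, "someText"),
            new KeyValuePair<int, string>(2, "someText"),
            new KeyValuePair<int, string>(3, "someText"),
            new KeyValuePair<int, string>(3, "someText1"),
            new KeyValuePair<int, string>(4, "someText")
        };

        List<KeyValuePair<int, string>> unique;
        List<KeyValuePair<int, string>> duplicates;
        Split(items, out unique, out duplicates);
        Console.WriteLine("unique: " + unique.Count + ", duplicates: " + duplicates.Count);
    }
}
```

For the sample in the question this gives three unique items (Ids 1, 2 and 4) and two duplicates (both with Id 3). The whole file is still held in memory, which should be acceptable for a few hundred thousand short lines.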

Related

Clone dataset > Change column type > Populate dataset

I have a dataset populated from a database:
dataset_original = new DataSet()
data_adapter.Fill(dataset_original)
and I cloned it:
dataset_cloned = dataset_original.Clone()
I cloned it because 1 of the columns in the original is of type int, and I want to change that to type string:
dataset_cloned.Tables(0).Columns("int_column_name_goes_here").DataType = GetType(String)
Now I need to populate the new dataset with the data from the old dataset. How do I do that?
I am using asp.net 1.1 coded with vb.net.
This simple loop should work (even with OPTION STRICT ON):
Dim dataset_cloned As DataSet = dataset_original.Clone()
dataset_cloned.Tables(0).Columns("int_column_name_goes_here").DataType = GetType(String)
For i As Int32 = 0 To dataset_original.Tables.Count - 1
Dim tbl_original As DataTable = dataset_original.Tables(i)
Dim tbl_cloned As DataTable = dataset_cloned.Tables(i)
For Each row As DataRow In tbl_original.Rows
tbl_cloned.Rows.Add(row.ItemArray)
Next
Next
Assuming you have only one table in the data set, you can do something like this. Note that Clone() copies only the schema, so the rows have to be added to the cloned table:
int columnIndex = 0; // index of the column you want to copy
DataTable source = dataset_original.Tables[0];
DataTable target = dataset_cloned.Tables[0];
for (int i = 0; i < source.Rows.Count; i++)
{
    DataRow newRow = target.NewRow();
    newRow[columnIndex] = source.Rows[i][columnIndex].ToString();
    target.Rows.Add(newRow);
}
In the same for loop you can copy the remaining columns' data into newRow before adding it.

Compare two strings and return how many words are the same using ASP.NET

How can I compare two strings and return how many words are the same, using ASP.NET?
I have written some code here, but it returns only the lengths of the strings:
string x = "Sabih Khan Afridi Sabih Khan Afridi";
string y = "Sabih Afridi";
int z = x.Length; int t = y.Length;
Label1.Text = "Total lengths: 1st->" + z.ToString() + " <<>> 2nd-" + t;
string[] common = x.Split().Intersect(y.Split()).ToArray();
int count = common.Length;
Plagiarism detection is not as simple as the above. You are better off using a library for this, such as Anti-Plagiaris or moss. They are open source, so you can also check the implementation.
Use Intersect to get similar words.
IEnumerable<string> listX = x.Split(' ').Distinct();
IEnumerable<string> listY = y.Split(' ').Distinct();
var similarWords = listX.Intersect(listY);
int numberOfSimilarWords = similarWords.Count();
Update:
To compare words from two files, you just need to read those files:
var firstFile = File.ReadAllText(@"C:\firstfile.txt", Encoding.ASCII).Split(' ').ToList();
var secondFile = File.ReadAllText(@"C:\secondfile.txt", Encoding.ASCII).Split(' ').ToList();
var similarWords = firstFile.Intersect(secondFile);
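Splitting on a single space treats "Khan" and "khan" as different words and leaves line breaks inside words when the text comes from a file. A possible variant, assuming case should be ignored and any whitespace separates words, is sketched below; the names WordOverlap and CountCommonWords are made up for this example.

```csharp
using System;
using System.Linq;

public static class WordOverlap
{
    // Counts the distinct words that occur in both strings, ignoring case.
    public static int CountCommonWords(string first, string second)
    {
        // Split on spaces, tabs and line breaks, and drop empty entries
        // so that runs of whitespace do not produce empty "words".
        char[] separators = { ' ', '\t', '\r', '\n' };
        string[] firstWords = first.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        string[] secondWords = second.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        // Intersect already returns each common word only once.
        return firstWords.Intersect(secondWords, StringComparer.OrdinalIgnoreCase).Count();
    }

    public static void Main()
    {
        string x = "Sabih Khan Afridi Sabih Khan Afridi";
        string y = "Sabih Afridi";
        Console.WriteLine(CountCommonWords(x, y));
    }
}
```

With the two strings from the question the count is 2 ("Sabih" and "Afridi"). Punctuation is not stripped here, so "Afridi," and "Afridi" would still count as different words.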

XML to Datatable conversion fails

I want to convert an XML string to a DataTable. The string looks like this:
<TextstringArray>
<values>
<value>athul</value>
<value>aks@phases.dk</value>
<value>1</value>
</values>
<values>
<value>arun</value>
<value>am@phases.dk</value>
<value>1</value>
</values>
<values>
<value>ajmal</value>
<value>am@phases.dk</value>
<value>1</value>
</values>
</TextstringArray>
I have tried something like this
StringReader theReader = new StringReader(invitations);
DataSet theDataSet = new DataSet();
theDataSet.ReadXml(theReader);
But the dataset comes out with wrongly formatted data: all value elements end up in a single column. I want them in three columns, one for the first value and so on. (The XML gets into the table, but not the XML structure.)
In order to achieve your goal, the XML should have the following structure:
<TextstringArray>
<values>
<value1>athul</value1>
<value2>aks@phases.dk</value2>
<value3>1</value3>
</values>
<values>
<value1>arun</value1>
<value2>am@phases.dk</value2>
<value3>1</value3>
</values>
<values>
<value1>ajmal</value1>
<value2>am@phases.dk</value2>
<value3>1</value3>
</values>
</TextstringArray>
This will produce one single datatable, where each values element will be the source of a data row, while each child element of it (value1,value2 etc) will be read as a column.
A workaround would be:
StringReader theReader = new StringReader(invitations); // invitations holds the XML string, as in the question
DataSet theDataSet = new DataSet();
theDataSet.ReadXml(theReader);
var valueIdsDatatable = theDataSet.Tables[0];
var valueDatatable = theDataSet.Tables[1];
// detect the maximum number of columns
var maxColumns = valueDatatable.AsEnumerable()
.GroupBy(i => i["values_Id"]).Max(i => i.Count());
// create the result DataTable
var resultDataTable = new DataTable();
// add dynamically the columns
for (int i = 0; i < maxColumns; i++)
{
resultDataTable.Columns.Add("property" + i);
}
// add the rows
foreach (DataRow valueId in valueIdsDatatable.Rows)
{
var newRow = resultDataTable.NewRow();
var currentRows = valueDatatable.Select("values_id = " + valueId[0]);
for (int i = 0; i < currentRows.Length; i++)
{
newRow[i] = currentRows[i][0];
}
resultDataTable.Rows.Add(newRow);
}
// TODO: use the resultDataTable
You could use some LINQy goodness to group and project into a new data-structure.
Note that the following code expects there to be exactly 3 values in every group, and relies on the input being consistent. If this is not the case you will need to adapt it to your environment.
StringReader theReader = new StringReader(invitations);
DataSet theDataSet = new DataSet();
theDataSet.ReadXml(theReader);
// Get the table we are interested in.
var dt = theDataSet.Tables["value"];
// Group by the "values_Id" field. This is what logically
// relates each <value>.
// Then, project out a new type, assuming that the first row
// holds the username, the second holds the email address,
// and the third holds "some thing else".
var rows = dt.Rows.
Cast<DataRow>().
GroupBy(dr => dr["values_Id"]).
Select(row =>
new
{
RowId = row.Key,
UserName = row.ElementAt(0)["value_Text"],
UserEmail = row.ElementAt(1)["value_Text"],
UserOtherValue = row.ElementAt(2)["value_Text"]
});
foreach (var row in rows)
{
Console.WriteLine("Row " + row.RowId);
Console.WriteLine(" UserName: " + row.UserName);
Console.WriteLine(" Email: " + row.UserEmail);
Console.WriteLine(" OtherValue: " + row.UserOtherValue);
}
Produces the following output:
Row 0
UserName: athul
Email: aks@phases.dk
OtherValue: 1
Row 1
UserName: arun
Email: am@phases.dk
OtherValue: 1
Row 2
UserName: ajmal
Email: am@phases.dk
OtherValue: 1
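If the goal is simply a DataTable with one row per values element, another option is to skip DataSet.ReadXml and parse the string with LINQ to XML. This is a sketch only: the class name XmlToTable and the column names Name, Email and Flag are assumptions (the XML itself does not name the fields), and it expects at most three value elements per group.

```csharp
using System;
using System.Data;
using System.Linq;
using System.Xml.Linq;

public static class XmlToTable
{
    // Builds a three-column table with one row per <values> element.
    public static DataTable Parse(string xml)
    {
        var table = new DataTable("values");
        // Hypothetical column names; the source XML does not provide any.
        table.Columns.Add("Name");
        table.Columns.Add("Email");
        table.Columns.Add("Flag");

        foreach (XElement group in XDocument.Parse(xml).Root.Elements("values"))
        {
            // The text of each <value> child becomes one cell, in document order.
            // Rows.Add throws if a group has more values than the table has columns.
            object[] cells = group.Elements("value").Select(v => (object)v.Value).ToArray();
            table.Rows.Add(cells);
        }
        return table;
    }

    public static void Main()
    {
        string invitations = @"<TextstringArray>
<values><value>athul</value><value>aks@phases.dk</value><value>1</value></values>
<values><value>arun</value><value>am@phases.dk</value><value>1</value></values>
</TextstringArray>";

        DataTable table = Parse(invitations);
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(row["Name"] + " | " + row["Email"] + " | " + row["Flag"]);
        }
    }
}
```

All cells are stored as strings; a group with fewer than three values leaves the remaining cells as DBNull.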

Extracting values of textbox in array?

I have dynamically created textboxes in ASP.NET. Now I am extracting the values with the following code:
string[] sublist = new string[] { };
int[] maxmarkslist = new int[] { };
int i;
for (i = 0; i < Convert.ToInt32(Label15.Text); i++)
{
string sub = "subject" + i;
string marks = "maxmarks" + i;
TextBox subject = (TextBox)PlaceHolder1.FindControl(sub);
TextBox maxmarks = (TextBox)PlaceHolder1.FindControl(marks);
sublist[i] = subject.Text;
maxmarkslist[i] = Convert.ToInt32(maxmarks.Text);
}
But I am getting the error "Index was outside the bounds of the array" for the two lines below:
sublist[i] = subject.Text;
maxmarkslist[i] = Convert.ToInt32(maxmarks.Text);
When I debugged it, values are coming in subject.Text and maxmarks.Text but not going to array.
Did I define the array in a wrong way?
You define both arrays as empty arrays, so you will get index out of bounds errors if you try to index into them.
Arrays do not expand dynamically. If you want that, use a collection type and, if needed, convert it to an array later.
Try this:
int length = Convert.ToInt32(Label15.Text);
string[] sublist = new string[length];
int[] maxmarkslist = new int[length];
for (int i = 0; i < length; i++)
{
string sub = "subject" + i;
string marks = "maxmarks" + i;
TextBox subject = (TextBox)PlaceHolder1.FindControl(sub);
TextBox maxmarks = (TextBox)PlaceHolder1.FindControl(marks);
sublist[i] = subject.Text;
maxmarkslist[i] = Convert.ToInt32(maxmarks.Text);
}
Or here is how to do this with a collection (List) type:
int length = Convert.ToInt32(Label15.Text);
List<string> sublist1 = new List<string>();
List<int> maxmarkslist1 = new List<int>();
for (int i = 0; i < length; i++)
{
string sub = "subject" + i;
string marks = "maxmarks" + i;
TextBox subject = (TextBox)PlaceHolder1.FindControl(sub);
TextBox maxmarks = (TextBox)PlaceHolder1.FindControl(marks);
sublist1.Add(subject.Text);
maxmarkslist1.Add(Convert.ToInt32(maxmarks.Text));
}
string[] sublist = sublist1.ToArray();
int[] maxmarkslist = maxmarkslist1.ToArray();
Note that with collections you don't have to specify the size up front. You can keep adding items, and the collection expands as needed. Arrays cannot do this.
Your string[] sublist = new string[] { }; is a shorthand that creates and initializes the array in one step. You don't have to specify the size; the compiler counts the elements between { } and sets the size accordingly. In your case, since there are no elements inside { }, it creates an empty array.
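The difference between the forms can be checked with a few lines. This is a small illustrative sketch; the class name ArraySizeDemo is made up for the example.

```csharp
using System;

public static class ArraySizeDemo
{
    public static void Main()
    {
        string[] empty = new string[] { };          // initializer with no elements
        string[] two = new string[] { "a", "b" };   // size taken from the initializer
        string[] sized = new string[3];             // explicit size, elements start as null

        Console.WriteLine(empty.Length);
        Console.WriteLine(two.Length);
        Console.WriteLine(sized.Length);

        try
        {
            empty[0] = "x"; // there is no index 0 in an empty array
        }
        catch (IndexOutOfRangeException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The three lengths are 0, 2 and 3, and the assignment into the empty array throws IndexOutOfRangeException, which is the error from the question.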
string[] sublist = new string[100];
int[] maxmarkslist = new int[100];
Use this, replacing 100 with the maximum possible value of your loop. This is not good practice, though; I will come back to this thread if I find something better.

Find a value from columns of a dataset asp.net

I want to find a value in the dataset column Id.
Here is the dataset:
Id Value
1 football
2 Tennis
3 Cricket
If any Id is absent from the column, I want to append that particular value to the dataset.
I guess that is a DataTable inside a DataSet. First you need to query if the id is in the DataTable:
var dataTable = dataSet.Tables[0]; //For this example I'm just getting the first DataTable of the DataSet, but it could be other.
var id = 1;
var value = "football";
//Any(...) returns true if any record matches the expression. Here, the expression checks whether the Id field of the row equals the provided id.
var contained = dataTable.AsEnumerable().Any(x =>x.Field<int>("Id") == id);
Then, if it's not there, add a new row:
if(!contained)
{
var row = dataTable.NewRow();
row["Id"] = id;
row["Value"] = value;
dataTable.Rows.Add(row);
}
Hope it helps
First you should use a loop to see whether your dataset column 'id' contains the value. If the id does not exist, then:
DataRow newrow = ds.Tables[0].NewRow(); //assuming ds is your dataset
newrow["id"] = "your new id value";
newrow["value"] = "your new value";
ds.Tables[0].Rows.Add(newrow);
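The loop mentioned in that answer is not shown, so here is one way it could look. This is a sketch with assumptions: the helper name AddIfMissing is made up, the Id column is assumed to hold non-null integers, and "Hockey" is just sample data.

```csharp
using System;
using System.Data;

public static class DataSetLookup
{
    // Adds a row with the given id and value unless the id is already present.
    // Returns true if a row was added.
    public static bool AddIfMissing(DataTable table, int id, string value)
    {
        foreach (DataRow row in table.Rows)
        {
            // Assumes the Id column never contains DBNull.
            if (Convert.ToInt32(row["Id"]) == id)
            {
                return false; // already there, nothing to do
            }
        }

        DataRow newRow = table.NewRow();
        newRow["Id"] = id;
        newRow["Value"] = value;
        table.Rows.Add(newRow);
        return true;
    }

    public static void Main()
    {
        // Rebuild the table from the question.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Value", typeof(string));
        table.Rows.Add(1, "football");
        table.Rows.Add(2, "Tennis");
        table.Rows.Add(3, "Cricket");

        Console.WriteLine(AddIfMissing(table, 2, "Tennis")); // id 2 exists, no row added
        Console.WriteLine(AddIfMissing(table, 4, "Hockey")); // id 4 is missing, row added
        Console.WriteLine(table.Rows.Count);
    }
}
```

This needs no LINQ, so it also works on older framework versions. If Id is the table's primary key, DataTable.Rows.Find(id) is an alternative to the manual loop.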
