Turn a range of numbers into a comma-separated list (Classic ASP) [closed] - asp-classic

This seems like a really common question, but I have yet to find a Classic ASP example.
I have data presented like the following from the database we have inherited:
120-128,10,20,30,12-19
I need to be able to convert this into a comma-separated list, in ascending order, pulling not only the numbers present but also the numbers within the ranges (specified by the -).
So in the above example, I would expect the output of:
10,12,13,14,15,16,17,18,19,20,30,120,121,122,123,124,125,126,127,128
THEN I want to be able to store that result as a single variable, so I can do more work with it later.
I have found Python, C#, JavaScript, and PHP methods, but not one for Classic ASP.
Can anyone help?
FYI, there would never be any duplicate numbers, each number will be unique.

The basic steps to do this are:
Split your initial list by commas.
Iterate through each of those items, checking whether it contains a hyphen.
If there is a hyphen, loop from the start of the range to the end and add each value to a list; if there is no hyphen, just add the value itself.
At that point you have a list of all values, unsorted and not unique.
In Classic ASP, you can use a .NET ArrayList (created through COM) to help with the sorting and uniqueness. Create two ArrayList objects: one will contain the non-unique working list, and the other will contain your final unique list.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<body>
<p>
<%
v = "120-128,10,20,30,12-19,13-22" 'our original string to parse (an overlapping range added to show de-duplication)
set uniqueList = CreateObject("System.Collections.ArrayList") 'final unique list
set mynumbers = CreateObject("System.Collections.ArrayList") 'a working list

'first split the values by the comma
splitCom = Split(v, ",")

'now go through each item
for itemnumber = 0 to UBound(splitCom)
    itemToAdd = splitCom(itemnumber)
    if InStr(itemToAdd, "-") > 0 then 'if the item has a hyphen, then we have a range of numbers
        rangeSplit = Split(itemToAdd, "-")
        for n = CInt(rangeSplit(0)) to CInt(rangeSplit(1)) 'add every number in the range
            mynumbers.Add n
        next
    else
        mynumbers.Add CInt(itemToAdd) 'otherwise add the value itself
    end if
next

'at this point, mynumbers contains a full list of all your values, unsorted and non-unique
mynumbers.Sort 'sort the list. Can't be any easier than this

'output the non-unique list, and build a unique list while we are at it
Response.Write("Non-unique list<br />")
for each item in mynumbers 'iterate through each item
    Response.Write(item & "<br />") 'print it
    if not uniqueList.Contains(item) then 'is the value already in our unique list?
        uniqueList.Add item 'no, so add it to the unique list
    end if
next

'now output the unique list
Response.Write("<br />Unique list<br />")
for each item in uniqueList
    Response.Write(item & "<br />")
next
%>
</p>
</body>
</html>
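
The question also asks to store the result as a single variable for later use. A minimal follow-up sketch, assuming the ArrayList's COM-visible ToArray() method is available (it returns an array that VBScript's Join() accepts):

result = Join(uniqueList.ToArray(), ",") 'one comma-separated string, ready for later use
Response.Write(result)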

Related

Eliminating empty key value pairs from dynamic column

I have an update policy which populates a target table column of dynamic type. The update policy logic for populating the dynamic column is as follows:
project target_calculated_column = pack("key1",src_col1,
"key2",src_col2,
"key3",src_col3,
.
.
"keyN",src_colN)
The columns src_col1,src_col2,...,src_colN are a fixed number of columns coming from a specific table which is the source for the update policy. These columns are of various datatypes, mostly strings and integers. The main thing here is that these columns may or may not contain values for the input rows: integer columns could be null, and string columns could be blank. Now the issue is that the update policy function is obviously written beforehand, so it can't know which rows will have nulls or blanks; that is only known when the update policy runs. So when the update policy runs, we end up with the following kind of data in the target column target_calculated_column (showing one sample value from a target row):
{
"key1":"sometext",
"key2":30,
"key3":null,
"key5":"hello",
"key6":"",
"key7":112,
"key8":"",
"key9":"",
.
.
"keyN":10
}
This demonstrates the problem. I don't want to keep key value pairs that are empty (nulls, blanks etc.) as part of target_calculated_column. I think what I am asking for is a conditional version of pack() that can ignore key value pairs with empty values, but I don't think such an option exists. Is there a way I can postprocess target_calculated_column to eliminate such key value pairs? For this example, I should get the following output:
{
"key1":"sometext",
"key2":30,
"key5":"hello",
"key7":112,
.
.
"keyN":10
}
The pack_all([ignore_null_empty]) function allows you to ignore null/empty values. If you want to remove specific columns, you can use the bag_remove_keys() function. The pack() function itself does not provide this option.
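A minimal KQL sketch of that suggestion (the table and column names here are placeholders, not from the question):

SourceTable
| project target_calculated_column = bag_remove_keys(pack_all(true), dynamic(["col_not_wanted"]))

pack_all(true) packs every column of the row while skipping null and empty values, and bag_remove_keys() then drops any columns that should not end up in the bag.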

Extract text from word and convert into Dataframe

I need to extract a specific portion of text from a Word document (.docx). The document has the following structure:
Question 1:
How many ítems…
 two
 four
 five
 ten
Explanation:
There are four ítems in the bag.
Question 2:
How many books…
 two
 four
 five
Explanation:
There are four books in the bag.
With this information I have to create a Dataframe (the desired layout was shown as an image in the original post).
I'm able to open the document, extract the text, and print the lines starting with the special character, but I'm not able to extract the rest of the strings of interest and create the Dataframe.
My code is:
import re
import docx

def getText(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

text = getText('document.docx')
strings = re.findall(r" (.+)\n", text)
Any help?
Thanks in advance
I would suggest you expand your regular expression to include all of the information you need. In this case I think you'll need two passes: one to get each question, and a second to parse the possible answers.
Take a look at your source text and break it down into the parts you need. Each item starts with Question n:, then a line for the actual question, multiple lines for the possible responses, followed by Explanation and a line for the explanation. We'll use the grouping operator to extract the parts of interest.
The Question line can be described by the following pattern:
"Question ([0-9]+):\n"
The line that represents the actual question is just text:
"(.+)\n"
The collection of possible responses is a series of lines beginning with a special character (I've replaced it with '*' because I can't tell what character it is from the post), allowing for possible whitespace:
\*\s*.+\n
but we can get the whole list of them using a combination of grouping including the non-capturing group:
((?:\*\s*.+\n)+)
That causes any number of matching lines to be captured as a single group.
Finally you have "Explanation" possibly preceded by some whitespace, and followed by a line of text:
\s*Explanation:\n(.+)\n
If we put these all together, our regex pattern is
r"Question\s+([0-9]+):\n(.*)\n((?:\*\s*.+\n)+)\s*Explanation:\n(.+)\n"
Parsing this:
patt = r"Question\s+([0-9]+):\n(.*)\n((?:\*\s*.+\n)+)\s*Explanation:\n(.+)\n"
matches = re.findall(patt, text)
yields:
[('1',
'How many ítems…',
'* two\n* four\n* five\n* ten\n',
'There are four ítems in the bag.'),
('2',
'How many books…',
'* two\n* four\n* five\n',
'There are four books in the bag.')]
Where each entry is a tuple. The 3rd item in each tuple is the text of all the answers as a single group, which you'll need to break down further.
The regex to match your answers (using the character '*') is:
\*\s*(.+)\n
Grouping it to eliminate the character, we can use:
r"(?:\*\s*(.+)\n)"
Finally, using a list comprehension we can replace the string value for the answers with a list:
matches = [(x[0], x[1], re.findall(r"(?:\*\s*(.+)\n)", x[2]), x[3]) for x in matches]
Yielding the result:
[('1',
'How many ítems…',
['two', 'four', 'five', 'ten'],
'There are four ítems in the bag.'),
('2',
'How many books…',
['two', 'four', 'five'],
'There are four books in the bag.')]
Now you should be prepared to massage that into your dataframe.
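If it helps, a minimal sketch of that last step, assuming pandas is available (the column names are placeholders for whatever layout you need):

import pandas as pd

# each tuple is (question number, question text, list of answers, explanation)
df = pd.DataFrame(matches, columns=['number', 'question', 'answers', 'explanation'])
print(df)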

Replacing multiple string intervals in R

I am currently working on a data set which has two header rows (the first acting as an overall category description and the second containing subcategories), and it happens that both contain various <text> intervals. For example:
In the first row (the column names of the data frame), I have a cell that contains:
- text... <span style=\"text-decoration: underline;\">in the office</span> on the activities below. Total must add up to 100%. <br /><br />
The second row contains multiple cells with:
- text <strong>
- text </strong>
Now, I was able to work out how to remove all <text> intervals in the second row through:
data[1,] = gsub("<.*>", "", data[1,])
However, for the column names row, if I use:
colnames(data) = gsub("<.*>", "",colnames(data))
I end up with just "text", which I don't want, because I still want to have:
text... in the office on the activities below. Total must add up to 100%
If someone has an idea of how to solve this, I would really appreciate it.
Thanks!
You can get what you need by changing the regular expression you are using to the following:
colnames(data) <- gsub("<[^>]+>", "",colnames(data))
This will remove each complete tag (everything from an opening < to the next >), leaving the text between tags intact. That should give you what you want.
Your current regex is greedy and consumes everything between the first opening bracket and the last closing bracket. One quick fix is to make your regex non-greedy by using ?:
data[1,] = gsub("<.*?>", "", data[1,])
Note that using regex to parse HTML generally is not a good idea. If you plan on doing anything with nested content then you should consider using an R package which can parse HTML content.
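For example, a quick check of the non-greedy pattern against strings from the question:

x <- c("text <strong>", "- text... <span style=\"text-decoration: underline;\">in the office</span> on the activities below.")
gsub("<.*?>", "", x)
# [1] "text "
# [2] "- text... in the office on the activities below."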

How do I dynamically append text to a Literal Class in asp.net instead of overwriting it?

Using asp.net and vb, I am trying to dynamically add names to a Literal control based on the date associated with each name. I have created two Lists, listDate and listShiftName, containing the dates and the associated names respectively. Each name has a date associated with it, but there can be more than one name per date, so when I try to add them, each name overwrites the previous one and only the last name in the list appears. Here is the code I have so far for adding the names:
For i = 0 To listDate.Count - 1
    If listDate(i) = DateTime.Today.ToShortDateString Then
        litToday.Text = listShiftName(i)
    End If
Next i
What do I need to change the line litToday.Text = listShiftName(i) to in order for it to append each name instead of overwriting the previous names with the last name in the list? I'm a N00B to asp.net and vb, so please excuse my ignorance.
Try this,
litToday.Text &= " " & listShiftName(i)
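In context, the whole loop becomes the following (the space separator is just one choice; use "<br />" instead if each name should appear on its own line):

For i = 0 To listDate.Count - 1
    If listDate(i) = DateTime.Today.ToShortDateString Then
        litToday.Text &= " " & listShiftName(i)
    End If
Next i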

What would cause Microsoft Jet OLEDB SELECT to miss a whole column?

I'm importing an .xls file using the following connection string:
If _
    SetDBConnect( _
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & filepath & _
        ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""", True) Then
This has been working well for parsing through several Excel files that I've come across. However, with this particular file, when I SELECT * into a DataTable, there is a whole column of data, Item Description, missing from the DataTable. Why?
Here are some things that may set this particular workbook apart from the others that I've been working with:
The workbook has a freeze pane consisting of the first 24 rows (however, all of these rows appear in the DataTable)
There is some weird cell highlighting going on throughout the workbook
That's pretty much it. I can't see anything that would make the Item Description column not import correctly. Its data consists entirely of Strings with no special characters apart from &, and each entry in this column is at most 20 characters. What is happening? Is there any other way I can get all of the data? Keep in mind that I have to use the original file and cannot alter it, as I want this to ultimately be an automated process.
Thanks!
Some initial thoughts/questions: Is the missing column the very first column? What happens if you remove the space within "Item Description"? Stupid question, but does that column have a column header?
-- EDIT 1 --
If you delete that column, does the problem move to another column (the new index 4), or is the file then complete? My reason for asking: is the problem specific to the data in that column/header, or is it more general, tied to index 4?
-- EDIT 2 --
Ok, so since we know it's that column, we know it's either the header or the rows. Let's concentrate on rows for now. Start with that ampersand; dump it and see what happens. Next, work with the first 50% of rows. Does deleting that subset affect anything? What about the latter 50%? If one of those subsets changes the result, you ought to be able to narrow it down to an individual row (hopefully not plural) by halving your selection each time.
My guess is that you're going to find a Unicode character or something else funky in one of the cells. Maybe there's a formula or, as you mentioned, some of that "weird cell highlighting."
It's been years since I worked with Excel access, but I recall some problems with Excel grouping content into areas that act as tables inside each sheet. Try copying the content from the problematic sheet into a new workbook and connect to that workbook. If this works, you may be able to investigate those areas a bit further.
