There is a requirement to upload an Excel file on an .aspx page, read its data and store it in a database. In the Excel file, the user can format a specific word or sentence inside any cell. We want to preserve that formatting in the form of HTML tags, i.e. read the data together with its formatting as HTML. How can this be achieved?
Probably the best way would be to use the Excel interop assemblies under the Microsoft.Office.Interop.Excel namespace. The code would look something like this:
Excel.Application excel = new Excel.Application();
excel.Workbooks.Open(fileName);
Excel.Worksheet activeWorksheet = (Excel.Worksheet)excel.ActiveSheet;
for (int i = 1; i < 100; i++){
    for (int j = 1; j < 100; j++){
        Excel.Range currentCell = (Excel.Range)activeWorksheet.Cells[i, j];
        // formatting
        var fontFamily = currentCell.Font.Name;
        var italics = currentCell.Font.Italic;
        var color = currentCell.Font.Color;
    }
}
This opens an Excel file and loops through the first 99 rows and columns.
But this could be too heavyweight, since it starts an Excel instance for each document - not sure what kind of performance is required. There are other libraries available that offer simple reading and writing of Excel files, but I'm not sure whether they expose formatting information. You can find some more info about those tools here: Import and export excel. I just checked and it seems EPPlus supports cell styling, so that might be an alternative.
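Since the asker wants the per-word formatting back as HTML, here is a minimal sketch of how that could look with EPPlus, assuming the bold/italic words arrive as rich-text runs in the cell; "upload.xlsx" and "Sheet1" are placeholder names, and the exact property names can differ between EPPlus versions, so treat it as a starting point rather than a drop-in solution:
using System.IO;
using System.Net;
using System.Text;
using OfficeOpenXml;

// Turns one cell into an HTML string, wrapping each rich-text run in <b>/<i>/<u> as needed.
static string CellToHtml(ExcelRangeBase cell)
{
    if (!cell.IsRichText)
        return WebUtility.HtmlEncode(cell.Text);

    var sb = new StringBuilder();
    foreach (var run in cell.RichText)
    {
        // Each run carries its own font settings, so a bold word inside an
        // otherwise plain sentence shows up as its own run.
        string text = WebUtility.HtmlEncode(run.Text);
        if (run.Bold) text = "<b>" + text + "</b>";
        if (run.Italic) text = "<i>" + text + "</i>";
        if (run.UnderLine) text = "<u>" + text + "</u>";
        sb.Append(text);
    }
    return sb.ToString();
}

// Usage: walk the used range of a sheet and keep the HTML per cell.
using (var package = new ExcelPackage(new FileInfo("upload.xlsx")))
{
    var sheet = package.Workbook.Worksheets["Sheet1"];
    foreach (var cell in sheet.Cells[sheet.Dimension.Address])
    {
        string html = CellToHtml(cell);
        // store html in the database here
    }
}
Note that whole-cell formatting lands on cell.Style.Font rather than in RichText, so depending on how users format their cells you may need to handle both cases.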
Related
I have an Excel file that contains all the filenames of the images. The paths of these images are stored in an ObservableList<File>, built from the folder that contains all of the images. My goal is to create a hyperlink for each of these filenames by matching it against the pool of image files.
I would like to ask how I can iterate faster through a large collection of File objects in order to get their paths.
For example:
Image name from Excel:
ABC_0001
The full path from the collection must be:
C:\Users\admin\Desktop\Images\ABC_0001.jpg
In order to get the full paths, I iterate through the collection with a Stream.
My procedure:
1. Extract the data using Apache POI.
2. Stream through the image collection, comparing each file's base filename against the extracted data.
3. Take the matching result and store its full path on the object via getAbsolutePath().
Code:
//storage during iteration
ObservableList<DetailedData> dataCollection = FXCollections.observableArrayList();
//Image collection containing over 13k images listed via commons-io
ObservableList<File> IMAGE_COLLECTION = FXCollections.observableArrayList(FileUtils.listFiles(browsedFOLDER, new String[]{"JPG", "JPEG", "TIF", "TIFF", "jpg", "jpeg", "tif", "tiff"}, true));
//Sheet data
Sheet sheet1 = wb.getSheetAt(0);
for (Row row : sheet1)
{
    DetailedData data = new DetailedData();
    //extracted data from excel
    String FILENAME = row.getCell(0, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK).getStringCellValue();
    //to be filled in based on the stream result
    String IMAGE_SOURCE = null;
    //stream code with the help of commons-io
    File IMAGE = IMAGE_COLLECTION.stream().filter(e -> FilenameUtils.getBaseName(e.getName()).toLowerCase().equals(FILENAME.toLowerCase())).findFirst().orElse(null);
    if (IMAGE != null)
        IMAGE_SOURCE = IMAGE.getAbsolutePath();
    data.setFileName(FILENAME);
    data.setFullPath(IMAGE_SOURCE);
    dataCollection.add(data);
}
Result:
Excel rows = 9,400
Image Files = 13,000
Iteration Time = 120,000ms
Do these results look normal, or can this be made faster?
I tried using parallelStream() and it was faster, but it also consumed much more CPU.
The code below should speed yours up a lot, but first a few questions about your code.
ObservableList<DetailedData> dataCollection = FXCollections.observableArrayList() - why are you using ObservableList? And why is this a list of DetailedData and not File? DetailedData has setFileName and setFullPath, but File already carries both of those.
ObservableList<File> IMAGE_COLLECTION = FXCollections.observableArrayList(FileUtils.listFiles(browsedFOLDER, new String[]{"JPG", "JPEG", "TIF", "TIFF", "jpg", "jpeg", "tif", "tiff"}, true)); - why ObservableList?
These two are small things, but I am curious.
So what I think you should do is use a Map. Your code should look something like the code below.
//storage during iteration
List<DetailedData> dataCollection = new ArrayList<>();
//Image collection containing over 13k images listed via commons-io
List<File> IMAGE_COLLECTION = new ArrayList<>(FileUtils.listFiles(new File("C:\\Users\\blj0011\\Pictures"), new String[]{"JPG", "JPEG", "TIF", "TIFF", "jpg", "jpeg", "tif", "tiff"}, true));
//Use this to map file name to file
Map<String, File> map = new HashMap<>();
//Use this to add data to the map
IMAGE_COLLECTION.forEach((file) -> map.put(file.getName().substring(0, file.getName().lastIndexOf(".")).toLowerCase(), file));
for (Row row : sheet1)
{
    //extracted data from excel
    String FILENAME = row.getCell(0, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK).getStringCellValue();
    //If the map contains the file name, create a DetailedData object. Then set data. Then add object to dataCollection list.
    if (map.containsKey(FILENAME.toLowerCase()))
    {
        DetailedData data = new DetailedData();
        data.setFileName(FILENAME);
        data.setFullPath(map.get(FILENAME.toLowerCase()).getAbsolutePath());
        dataCollection.add(data);
    }
}
Further explanation is in the comments in the code.
I still believe this could be cleaned up a little more if you used List<File> dataCollection = new ArrayList<>().
If you really want to speed up your search, you should try not to do things repeatedly that could be done just once. For example, you could use two loops: the first to prepare your search and the second to actually do the search. Inside your filter you call FilenameUtils.getBaseName and convert to lower case twice, for every single row. It would be better to do these things only once, in the first loop, and store the resulting Strings in a list. In the second loop you then run the search against that list.
I am also wondering why you use ObservableLists here. A simple List would do just as well.
I've tested another approach to this slow iteration.
It seems that the cause is building the Stream repeatedly inside the for-each.
I tried Baeldung's Supplier solution and declared it outside the loop, combined with parallelStream().
Sample Code:
Supplier<Stream<File>> streamSupplier = () -> imageCollection.parallelStream();
for (Row row : sheet)
{
    File IMAGE = streamSupplier.get().filter(e -> FilenameUtils.getBaseName(e.getName()).toLowerCase().equals(FILENAME.toLowerCase())).findFirst().orElse(null);
    if (IMAGE != null)
        IMAGE_SOURCE = IMAGE.getAbsolutePath();
}
The result went down to 45,000 ms.
Please correct me if my approach is not right.
How can I convert an image to an array of bytes using the ImageSharp library?
Can ImageSharp also suggest/provide a RotateMode and FlipMode based on the EXIF orientation?
If you are looking to convert the raw pixels into a byte[], you do the following:
var bytes = image.SavePixelData();
If you are looking to get the encoded image as a byte[] (which I suspect is what you are after), you do this:
using (var ms = new MemoryStream())
{
image.Save(ms, imageFormat);
return ms.ToArray();
}
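Regarding the second part of the question (a RotateMode/FlipMode derived from EXIF): I'm not aware of a call that hands you those two values directly, but ImageSharp can apply the EXIF orientation for you before you save. A minimal sketch, assuming the SixLabors.ImageSharp.Processing extensions are referenced and "photo.jpg" is a placeholder input file:
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.Processing;

using (var image = Image.Load("photo.jpg"))
{
    // AutoOrient rotates/flips the pixels to match the EXIF Orientation tag
    // and resets the tag, so the saved image displays upright everywhere.
    image.Mutate(x => x.AutoOrient());
    image.Save("photo-upright.jpg");
}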
For those who are looking after 2020:
SixLabors seems to like changing names and adding abstraction layers, so...
Now, to get the raw byte data, you do the following steps:
1. Get the memory group of the image using the GetPixelMemoryGroup() method.
2. Convert it to an array (because GetPixelMemoryGroup() returns an interface) and take the first element (if somebody can tell me why they did it that way, I'd appreciate it).
3. From that Memory<TPixel>, get a Span and then proceed the old way.
(I prefer the solution from @Majid's comment.)
So the code looks something like this:
var _IMemoryGroup = image.GetPixelMemoryGroup();
var _MemoryGroup = _IMemoryGroup.ToArray()[0];
var PixelData = MemoryMarshal.AsBytes(_MemoryGroup.Span).ToArray();
Of course you don't have to split this into separate variables; you can do it in one line of code. I did it just for clarity. This solution is only guaranteed to be valid as of 06 Sep 2020.
I'm a newbie at InDesign scripting, so I apologise that I couldn't post my own attempts.
Objective:
I have an INDD document which contains figure captions, labels, etc. I need to copy content (an editable figure) from another INDD file into this document wherever the related figure label exists.
For example:
sample.indd
Some text
Fig.1.1 caption
some text
I need to copy the content of figure1.indd and paste it into the sample.indd document where the string Fig.1.1 occurs, and so on. Right now I'm doing this manually, but I'm supposed to automate it.
So I need some hints on how to achieve this using ExtendScript.
I have found something like the code below, but I have no clue how to develop it further, and I'm also not sure whether this approach is the right way to get my result. Please help me.
myDocument = app.open(File("file.indd"), false); //opening a file without showing it, to get at its content
myDocument.pages.item(0).textFrames.item(0).contents = "some text";
//here I can set the contents, but I don't know how to read them
// ?????? Then I have to paste the content into the active document.
I found the script for my requirement.
var myDoc = File("sample.indd");//Destination File
var myFigDoc = File("fig.indd");//Figure File
app.open(File(myFigDoc));
app.activeDocument.pageItems.everyItem().select();
app.copy();
app.open(File(myDoc));
app.findGrepPreferences = app.changeGrepPreferences = null;
app.findGrepPreferences.findWhat = "FIG. 1.1 ";//Figure caption text
//app.findGrepPreferences.appliedParagraphStyle = "FigureCaption";//Figure Caption Style
myFinds = app.activeDocument.findGrep();
for (var i = 0; i < myFinds.length; i++) {
    myFinds[i].insertionPoints[0].contents = "\r";
    myFinds[i].insertionPoints[0].select();
    app.paste();
}
app.findGrepPreferences = app.changeGrepPreferences = null;
If it is acceptable for you, you can place an InDesign file as a link (Place…). So a script could try to catch the "Fig…" strings and do the import.
Have a look at scripts that use the findGrep() and place() commands.
I'm setting up a spreadsheet for someone else with a form to enter data.
One of the columns is supposed to hold a date. The input date format is like this example: "Jan 26, 2013" (there will be a lot of copy & paste involved in collecting the data, so changing the format at the input step is not a real option).
I need this date column to be sortable, but the spreadsheet doesn't recognize this as a date, only as a string. (It would recognize "Jan-26-2013"; I've tried.)
So I need to reformat the input date.
My question is: how can I do this? I have looked around, and Google Apps Script looks like the way to go (though I haven't found a good example of reformatting yet).
Unfortunately my only programming experience is in Python, and of intermediate level. I could do this in Python without a problem, but I don't know any JavaScript.
(My Python approach would be:
splitted = date.split()
newdate = "-".join([splitted[0], splitted[1][:-1], splitted[2]])
return newdate
)
I also don't know how I'd go about linking the script to the spreadsheet - would I attach it to the cell, or the form, or somewhere else? And how? Any link to a helpful, understandable tutorial on this point would help greatly.
Any help greatly appreciated!
Edit: Here's the code I ended up with:
//Function to filter unwanted " chars from date entries
function reformatDate() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var startrow = 2;
  var firstcolumn = 6;
  var columnspan = 1;
  var lastrow = sheet.getLastRow();
  var dates = sheet.getRange(startrow, firstcolumn, lastrow, columnspan).getValues();
  var newdates = [];
  for (var i in dates) {
    var mydate = dates[i][0];
    var newdate;
    try {
      newdate = mydate.replace(/"/g, '');
    } catch (err) {
      newdate = mydate;
    }
    newdates.push([newdate]);
  }
  sheet.getRange(startrow, firstcolumn, lastrow, columnspan).setValues(newdates);
}
For other confused Google Apps Script newbies like me:
- Attaching the script to the spreadsheet works by creating the script from within the spreadsheet (Tools => Script Editor). Just putting the function in there is enough; you don't seem to need an explicit function call.
- You select the trigger for the script from the Script Editor (Resources => This Project's Triggers).
- Important: the script will only work if there's an empty row at the bottom of the sheet in question (presumably because the range above starts at row 2 but spans getLastRow() rows, so it reaches one row past the last data row)!
Just an idea:
If you double-click on your date string in the spreadsheet, you will see that the real value that makes it a string instead of a date object is 'Jan 26, 2013, with a ' in front of the string that I didn't add here... (The form does that to let you type whatever you want into the text area, including +322475... for example if it is a phone number; that's a known trick in spreadsheet cells.) You could simply write a script that runs on form submit and removes the ' from the cells; I guess the spreadsheet would do the rest... (I didn't test that, so give it a try and consider this a suggestion.)
To remove the ' you can simply use the .replace() method:
var newValue = value.replace(/'/g,'');
Here are some links to the relevant documentation: link1 link2
EDIT following your comment:
It could be simpler, since replace doesn't throw an error when no match is found. So you could write it like this:
function reformatDate() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var dates = sheet.getRange(2, 6, sheet.getLastRow(), 1).getValues();
  var newdates = [];
  for (var i in dates) {
    var mydate = dates[i][0];
    var newdate = mydate.replace(/"/g, '');
    newdates.push([newdate]);
  }
  sheet.getRange(2, 6, sheet.getLastRow(), 1).setValues(newdates);
}
Also, you used " in your code, presumably on purpose... my test showed ' instead. What made you make this choice?
Solved it, I just had to change the comma to a dot and it worked.
I'm adding SQLite support to my Google Chrome extension, to store historical data.
When creating the database, you are required to set its maximum size (I used 5 MB, as suggested in many examples).
I'd like to know how much storage I'm really using (for example, after adding 1000 records), to get an idea of when the 5 MB limit will be reached, and act accordingly.
The Chrome console doesn't reveal such figures.
Thanks.
You can calculate those figures if you want to. Basically, the default limit for both localStorage and Web SQL storage is 5 MB; names and values are stored as UTF-16, so every character takes two bytes, which really leaves you with about 2.5 million stored characters. In a Chrome extension you can lift that limit by adding the "unlimitedStorage" permission to the manifest.
The same applies to Web SQL storage, but there you have to go through all the tables and figure out how many characters there are per row.
For localStorage you can test this with a population script:
var row = 0;
localStorage.clear();
var populator = function () {
  localStorage[row] = '';
  var x = '';
  for (var i = 0; i < (1024 * 100); i++) {
    x += 'A';
  }
  localStorage[row] = x;
  row++;
  console.log('Populating row: ' + row);
  populator();
};
populator();
The above should crash at around row 25 when it runs out of space, which works out to roughly 2.5 million characters (5 MB of UTF-16). You can do the inverse: count how many characters you store per row, and that tells you how much of the space you have used.
Another way to do this is to always add a "payload" and check for an exception; if one is thrown, you know you're out of space.
try {
  localStorage['foo'] = 'SOME_DATA';
} catch (e) {
  console.log('LIMIT REACHED! Do something else');
}
Internet Explorer did something called "remainingSpace", but that doesn't work in Chrome/Safari:
http://msdn.microsoft.com/en-us/library/cc197016(v=VS.85).aspx
I'd like to add a suggestion.
If it is a Chrome extension, why not make use of Web SQL storage or IndexedDB?
http://html5doctor.com/introducing-web-sql-databases/
http://hacks.mozilla.org/2010/06/comparing-indexeddb-and-webdatabase/
Source: http://caniuse.com/