Scrape Highchart, missing data - web-scraping

I've been trying to scrape a specific Highcharts chart using console commands, something along the lines of:
data = $('div#graphCont2').highcharts().series[0].data; console.log(data);
This code works on the following site; I retrieve all data.
test-hichart1
However, when I rework the code for the graph I intend to scrape (chart; it's the uppermost chart, APX-PSE, for all X and Y entries), I miss data. What I get varies (based on the timestamps, it seems to depend on the selected period): with the period set to "all", I only get data from around timestamp 1562284800000 onwards, thus missing roughly two thirds of all entries.
I use this code:
data = $('div#stockchart_apx').highcharts().series[0].data; console.log(data);
My idea was to use console.table to get the info I need, though I'm unsure whether the table is usable past 999 entries anyway.
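For reference, something like this is what I have in mind (assuming the standard devtools console.table(data, columns) signature; x and y are the usual Highcharts point properties):
data = $('div#stockchart_apx').highcharts().series[0].data;
// Reduce each point object to its primitive fields before tabulating;
// console.table renders an array of plain objects as rows.
console.table(data.map(p => ({ x: p.x, y: p.y })), ['x', 'y']);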
Does anyone have an idea of why the readout fluctuates and how I can retrieve all the information?
Thanks!
EDIT ~ So, after a couple more hours, I managed to get all the data by opening the graph in full-window mode. I'm not sure where the difference originates, but it worked. I scraped the data with:
data = $('div#stockchart_apx').highcharts().series[0].data;
const getCircularReplacer1 = () => {
  const seen = new WeakSet();
  return (key, value) => {
    if (typeof value === "object" && value !== null) {
      if (seen.has(value)) {
        return;
      }
      seen.add(value);
    }
    return value;
  };
};
JSON.stringify(data, getCircularReplacer1());
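In case the cropping bites someone else: a minimal alternative sketch that serializes plain x/y pairs instead of full point objects, so no circular-reference replacer is needed. This assumes series.xData and series.yData (internal Highcharts arrays) still hold the raw values when series.data is cropped to the visible range, which is worth verifying in the console first:
const series = $('div#stockchart_apx').highcharts().series[0];
// Zip the raw x/y arrays into plain objects; these carry no back-references
// to the chart, so JSON.stringify works without a replacer.
const points = series.xData.map((x, i) => ({ x: x, y: series.yData[i] }));
JSON.stringify(points);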

Related

AppSpreadsheet (GAS): avoid some problems with systematically tested data

In my current spreadsheet job, every inserted value passes through a test that checks whether the same value is found at the same index in other sheets. If the test fails, a warning message is placed in the current cell.
// minimalist algorithm
function safeInsertion(data, row_, col_)
{
  let rrow = row_ - 1; // range row
  let rcol = col_ - 1; // range col
  const active_sheet_name = getActiveSheetName(); // does what its name suggests
  const all_sheets = SpreadsheetApp.getActiveSpreadsheet().getSheets();
  // test to evaluate the value to be inserted in the sheet
  for (let sh of all_sheets)
  {
    if (sh.getName() === active_sheet_name)
      continue;
    // getSheetValues does what its name suggests.
    if (getSheetValues(sh)[rrow][rcol] === data)
      return "prohibited insertion";
  }
  return data;
}
// usage (in cell): =safeInsertion("A scarce data", ROW(), COLUMN())
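The two helpers are only described by their names; a plausible minimal implementation (an assumption, not part of the original post) would be:
// Hypothetical stand-ins for the helpers the post only names.
function getActiveSheetName() {
  return SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getName();
}
function getSheetValues(sheet) {
  return sheet.getDataRange().getValues(); // full 2D array of the sheet's values
}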
The problems are:
Cached values confuse me sometimes. The script or data changes, but the sheet does not notice until I manually re-enter the cell's content or refresh the whole table. Is there any relevant configuration for this issue?
Sometimes, on loading, a messy result appears: almost all data is marked as prohibited, for example (originally, everything was fine!).
What can I do to obtain a stable sheet using this approach?
PS: The original function does more testing on each data insertion. Those tests consist of counting the frequency of the value in the current sheet and in all sheets.
EDIT:
In fact, I can't create a stable sheet. For testing, I leave you a copy of my code with minimal adaptations.
function safelyPut(data, max_onesheet, max_allsheet, row, col)
{
  // general initialization
  const data_regex = "^\\s*" + data + "\\s*$"; // backslashes escaped so the regex survives the string literal
  const spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  const activesheet = spreadsheet.getActiveSheet();
  const active_text_finder = activesheet.createTextFinder(data_regex)
    .useRegularExpression(true)
    .matchEntireCell(true);
  const all_text_finder = spreadsheet.createTextFinder(data_regex)
    .useRegularExpression(true)
    .matchEntireCell(true);
  const all_occurrences = all_text_finder.findAll();
  // test the data's general environment
  const active_freq = active_text_finder.findAll().length;
  if (max_onesheet <= active_freq)
    return "Too much in a sheet";
  const all_freq = all_occurrences.length;
  if (max_allsheet <= all_freq)
    return "Too much in the work";
  // test uniqueness at a position
  const active_sname = activesheet.getName();
  for (const occurrence of all_occurrences)
  {
    const sname = occurrence.getSheet().getName();
    //if (SYSTEM_SHEETS.includes(sname))
    //  continue;
    if (sname != active_sname)
      if (occurrence.getRow() == row && occurrence.getColumn() == col)
        if (occurrence.getValue() == data)
        {
          return `${sname} contains the same data at the same indexes.`;
        }
  }
  return data;
}
Create two or three cells and randomly put a value in a short range, following the usage:
=safelyPut("Scarce Data", 3, 5, ROW(), COLUMN())
Do this and you will probably get an unstable sheet.
Regarding "Cached values confuse me sometimes. The script is changed but not perceived by the sheet until manually renewing the cell's content or refreshing the whole table. Is there any relevant configuration for this issue?": when you want to refresh your custom function safeInsertion, I thought that this thread might be useful.
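As a concrete illustration of that refresh technique (my own sketch, not from the linked thread): custom functions are only recalculated when their arguments change, so a throwaway argument can be used to force a re-run.
// Hypothetical wrapper: `tick` is ignored by the logic, but pointing it at a
// cell you edit forces Sheets to re-run the function, bypassing the cache.
function safeInsertionRefreshable(data, row_, col_, tick) {
  return safeInsertion(data, row_, col_);
}
// usage (in cell): =safeInsertionRefreshable("A scarce data", ROW(), COLUMN(), $Z$1)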
Regarding "Sometimes, at loading, a messy result appears. Almost all data is prohibited, for example (originally, all was fine!)." and "What can I do to obtain a stable sheet using this approach?": in this case, how about reducing the processing cost of your script? I thought that by reducing the processing cost, your situation might become a bit more stable.
With that in mind, how about the following modification?
Modified script:
function safeInsertion(data, row_, col_) {
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const range = ss.createTextFinder(data).matchEntireCell(true).findNext();
  return range &&
    range.getRow() == row_ &&
    range.getColumn() == col_ &&
    range.getSheet().getSheetName() != ss.getActiveSheet().getSheetName()
    ? "prohibited insertion"
    : data;
}
The usage is the same as with your current script: =safeInsertion("A scarce data", ROW(), COLUMN()).
In this modification, TextFinder is used, because I thought that when a value is searched across all sheets of a Google Spreadsheet, TextFinder is suitable for reducing the processing cost.
References:
createTextFinder(findText) of Class Spreadsheet
findNext()

How is the field SourceBaseAmountCur from TmpTaxWorkTrans table computed?

I need to find out how SourceBaseAmountCur is computed; in my case, I am getting an error in the Amount Origin field on the SST window, where it doesn't show 0 when it should.
I am coming from General Ledger > Journals > General Journal > (select a record, go to Lines) > then the SST window, and finally the Amount Origin field.
The Amount Origin is a display field:
display TaxBaseCur displaySourceBaseAmountCur(TmpTaxWorkTrans _tmpTaxWorkTrans)
{
    return taxTmpWorkTransForm.getSourceBaseAmountCur(_tmpTaxWorkTrans);
}
As seen in the code above, it already passes a TmpTaxWorkTrans record. Going to that method on the class TaxTmpWorkTransForm, this is the implementation:
public TaxAmountCur getSourceBaseAmountCur(TmpTaxWorkTrans _tmpTaxWorkTrans = null, TmpTaxRegulation _tmpTaxRegulation = null)
{
    if (_tmpTaxRegulation)
    {
        return _tmpTaxRegulation.SourceBaseAmountCur;
    }
    else
    {
        return _tmpTaxWorkTrans.SourceBaseAmountCur * _tmpTaxWorkTrans.taxChangeDisplaySign(accountTypeMap);
    }
}
I found this article: https://dynamicsuser.net/ax/f/technical/92855/how-tmptaxworktrans-populated
and I started from Class\Tax\insertIntersection, but unfortunately I couldn't find what I was looking for; I've been debugging for days.
An important distinction is tax calculation for a posted vs non-posted journal. It appears you are looking at non-posted journals.
I don't have great data to test this with and I just hacked this POC job together in 20 minutes, but it should have enough "bits" that you can run with it and get the information you need.
static void Job3(Args _args)
{
    TaxCalculation      taxCalculation;
    LedgerJournalTrans  ledgerJournalTrans;
    TmpTaxWorkTrans     tmpTaxWorkTrans;
    TaxAmountCur        taxAmountCur;

    ledgerJournalTrans = LedgerJournalTrans::findRecId(5637293082, false); // Use your own journal line

    // The reason we call the below stuff is `element.getShowTax()`, called from `\Forms\LedgerJournalTransDaily\Designs\Design\[ActionPane:ActionPane]\[ActionPaneTab:ActionPaneTab]\[ButtonGroup:ButtonGroup]\MenuItemButton:TaxTransSource\Methods\clicked`
    // This is from `\Classes\LedgerJournalEngine\getShowTax`
    taxCalculation = LedgerJournalTrans::getTaxInstance(ledgerJournalTrans.JournalNum, ledgerJournalTrans.Voucher, ledgerJournalTrans.Invoice, true, null, false, ledgerJournalTrans.TransDate);
    taxCalculation.sourceSingleLine(true, false);

    // This is from `\Classes\TaxTmpWorkTransForm\initTax`
    tmpTaxWorkTrans.setTmpData(taxCalculation.tmpTaxWorkTrans());

    // This is the temporary table that is populated
    while select tmpTaxWorkTrans
    {
        // This is from `\Classes\TaxTmpWorkTransForm\getSourceBaseAmountCur`
        taxAmountCur = tmpTaxWorkTrans.SourceBaseAmountCur * tmpTaxWorkTrans.taxChangeDisplaySign(null); // I pass null because the map doesn't appear to be used... investigate?

        // This just outputs some data
        info(strFmt("%1: %2", tmpTaxWorkTrans.TaxCode, taxAmountCur));
    }
}

Crossfilter grouping filtered keys

I have some JSON, for example:
data = [
  {"name":"Bob","age":"20"},
  {"name":"Jo","age":"21"},
  {"name":"Jo","age":"22"},
  {"name":"Nick","age":"23"}
];
Next, I use crossfilter to create a dimension and filter it:
let ndx = crossfilter(data);
let dim = ndx.dimension(d => d.name).filter(d => d !== "Jo");
// try to get the filtered values
let filtered = dim.top(Infinity); // -> returns the 2 records where name != "Jo"
// {"name":"Bob","age":"20"}
// {"name":"Nick","age":"23"}
let myGroup = dim.group(d => {
  if (d === 'Jo') {
    // Why do we get here? These values should already be filtered out
  }
});
How can I filter my dimension so that these values don't end up in dim.group?
Not sure what version you are using, but in the current version of Crossfilter, when a new group is created all records are first added to the group and then filtered records are removed. So the group accessor will be run at least once for all records.
Why do we do this? Because for certain types of grouping logic, it is important for the group to "see" a full picture of all records that are in scope.
It is possible that the group accessor is run over all records (even filtered ones) anyway in order to build the group index, but I don't remember.
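A minimal sketch of the usual workaround (my assumption of the intent: keep "Jo" out of the reported groups). A group only observes filters on other dimensions, never on its own dimension, so filter on one dimension and group on a second dimension over the same key:
let ndx = crossfilter(data);
let filterDim = ndx.dimension(d => d.name);
filterDim.filter(d => d !== "Jo");   // exclude "Jo" here...
let groupDim = ndx.dimension(d => d.name);
let myGroup = groupDim.group();      // ...and count names here
// The "Jo" bin now reduces to 0. Note that group.all() still lists
// zero-valued keys, so drop them when rendering.
let nonEmpty = myGroup.all().filter(g => g.value > 0);
console.log(nonEmpty); // [{key: "Bob", value: 1}, {key: "Nick", value: 1}]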

SQLite storage API insert statement freezes entire Firefox in bootstrapped (restartless) add-on

The data to be inserted has just two TEXT columns, whose individual lengths don't even exceed 256 characters.
I initially used executeSimpleSQL since I didn't need to get any results.
It worked smoothly for simultaneous inserts of up to 20K rows, i.e. no lag or freezing observed in the background.
However, with 0.1 million rows I saw horrible freezing during insertion.
So, I tried these two approaches:
Insert in chunks of 500 records - this didn't work well, since even for 20K records it showed visible freezing. I didn't even try 0.1 million.
Go async, using executeAsync along with binding etc. - this also shows visible freezing for just 20K records. Here the whole array was inserted at once, not in chunks.
var dirs = Cc["@mozilla.org/file/directory_service;1"].
           getService(Ci.nsIProperties);
var dbFile = dirs.get("ProfD", Ci.nsIFile);
var dbService = Cc["@mozilla.org/storage/service;1"].
                getService(Ci.mozIStorageService);
dbFile.append('mydatabase.sqlite');
var connectDB = dbService.openDatabase(dbFile);

let insertStatement = connectDB.createStatement(
  'INSERT INTO my_table (my_col_a, my_col_b) VALUES (:myColumnA, :myColumnB)');

var arraybind = insertStatement.newBindingParamsArray();
for (let i = 0; i < my_data_array.length; i++) {
  let params = arraybind.newBindingParams();
  // Individual elements of the array are CSV
  let my_data_arrayTC = my_data_array[i].split(',');
  params.bindByName("myColumnA", my_data_arrayTC[0]);
  params.bindByName("myColumnB", my_data_arrayTC[1]);
  arraybind.addParams(params);
}
insertStatement.bindParameters(arraybind);
insertStatement.executeAsync({
  handleResult: function(aResult) {
    console.log('Results are out');
  },
  handleError: function(aError) {
    console.log("Error: " + aError.message);
  },
  handleCompletion: function(aReason) {
    if (aReason != Components.interfaces.mozIStorageStatementCallback.REASON_FINISHED)
      console.log("Query canceled or aborted!");
    console.log('We are done inserting');
  }
});
connectDB.asyncClose(function() {
  console.log('[INFO][Write Database] Async - plus domain data');
});
Also, I seem to get the async callbacks after a long time. Usually, executeSimpleSQL is way faster than this. If I use the SQLite Manager Tool extension to open the DB immediately, this is what I get (as expected):
SQLiteManager: Error in opening file mydatabase.sqlite - either the file is encrypted or corrupt
Exception Name: NS_ERROR_STORAGE_BUSY
Exception Message: Component returned failure code: 0x80630001 (NS_ERROR_STORAGE_BUSY) [mozIStorageService.openUnsharedDatabase]
My primary objective is to dump data as big as 0.1 million+ rows and then later perform reads when needed.
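For reference, a sketch of the same bulk insert via Firefox's Sqlite.jsm module, which I am assuming is available to this add-on (it shipped with Firefox in the bootstrapped-add-on era). It is promise-based, runs statements off the main thread, and the explicit transaction batches the rows, which is the usual cure for bulk-insert stalls; my_data_array and the table/column names are taken from the question:
Components.utils.import("resource://gre/modules/Sqlite.jsm");
Components.utils.import("resource://gre/modules/Task.jsm");

Task.spawn(function* () {
  // A bare filename is resolved relative to the profile directory.
  let conn = yield Sqlite.openConnection({ path: "mydatabase.sqlite" });
  try {
    // One transaction for all rows instead of one implicit transaction per INSERT.
    yield conn.executeTransaction(function* () {
      for (let row of my_data_array) {
        let cols = row.split(',');
        yield conn.execute(
          "INSERT INTO my_table (my_col_a, my_col_b) VALUES (:a, :b)",
          { a: cols[0], b: cols[1] });
      }
    });
  } finally {
    yield conn.close();
  }
});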

$.grep on JSON data in multiple array.fields using wildcards?

First off, I have looked through similar-looking questions but have not found the exact problem asked or answered, so here goes:
I have a JSON array which consists of 900+ records, looking like this:
var JsonData = [
  {"rowNumber":563663,"hasWarning":true,"isInvoiceAccount":true,"phone":"","name":"Romerike AS","address1":"Co/Skanning","address2":"PB 52","attention":"","mobile":"","email":"fakt@bos.no","fax":"","zipCity":"N-1471 Askim","invoiceAccount":"","notes":null,"account":"3","country":"NORGE","salesRep":"4","countryCode":"no"},
  {"rowNumber":563674,"hasWarning":false,"isInvoiceAccount":true,"phone":"","name":"LILLEHAMMER","address1":"POSTBOKS 110","address2":"","attention":"","mobile":"","email":"","fax":"","zipCity":"N-2605 LILLEHAMMER","invoiceAccount":"","notes":null,"account":"14","country":"NORGE","salesRep":"4","countryCode":"no"},
  {"rowNumber":563676,"hasWarning":true,"isInvoiceAccount":true,"phone":"63929788","name":"Askim Bil AS","address1":"Postboks 82","address2":"","attention":"","mobile":"","email":"karosseri@nyg.no","fax":"","zipCity":"N-2051 Askim","invoiceAccount":"","notes":null,"account":"16","country":"NORGE","salesRep":"4","countryCode":"no"},
  {"rowNumber":563686,"hasWarning":false,"isInvoiceAccount":true,"phone":"69826060","name":"KAROSSERI A/S","address1":"POSTBOKS 165","address2":"","attention":"","mobile":"","email":"tkar@online.no","fax":"","zipCity":"N-1860 TRØGSTAD","invoiceAccount":"","notes":null,"account":"26","country":"NORGE","salesRep":"4","countryCode":"no"},
  {"rowNumber":563690,"hasWarning":false,"isInvoiceAccount":true,"phone":"","name":"AUTOSERVICE A/S","address1":"POSTBOKS 15","address2":"","attention":"","mobile":"","email":"","fax":"","zipCity":"N-2851 LENA","invoiceAccount":"","notes":null,"account":"30","country":"NORGE","salesRep":"4","countryCode":"no"},
  {"rowNumber":563691,"hasWarning":false,"isInvoiceAccount":false,"phone":"","name":"ØYHUS A/S","address1":"POSTBOKS 321","address2":"","attention":"John Doe","mobile":"","email":"","fax":"","zipCity":"N-2817 GJØVIK","invoiceAccount":"","notes":null,"account":"31","country":"NORGE","salesRep":"4","countryCode":"no"}
];
I want to filter these data before I read them into a table using $.grep.
The JSON data has already been loaded as an object.
In the HTML page I have a text field named "filter".
The following code works, but only when I search for an exact match:
var JsonFiltered = $.grep(JsonData, function (element, index) {
  return element.zipCity == $('#filter').val();
});
$.each(JsonFiltered, function (index, value) {
  // sorting through the array, adding values to a table
  [...]
});
Problem 1:
I want to use wildcards when filtering. I read something about using regexp, but I haven't found any viable examples.
Problem 2:
I want to be able to filter on more than one column. Example: filtering the word "Askim" in both element.name and element.zipCity.
So I figured out the solutions myself...
Using Wildcards:
var search_term = $('#filter').val();
var search = new RegExp(search_term, "i");
var JsonFiltered = $.grep(JsonData, function (element, index) {
  var zipC = search.test(element.zipCity);
  var names = search.test(element.name);
  return zipC || names;
});
The solution was to use new RegExp with the "i" (case-insensitive) flag.
Then I took the two search.test results, combined them in the return statement, and... presto.
Hope this helps anyone else.
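A small generalization of the same idea (my sketch, not from the original answer): test the term against a configurable list of fields, and escape regex metacharacters so a stray "(" or "*" typed into the filter box can't break the RegExp:
var fields = ['name', 'zipCity', 'address1'];
// Escape regex metacharacters so the user's input is matched literally.
var raw = $('#filter').val();
var search = new RegExp(raw.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i');
var JsonFiltered = $.grep(JsonData, function (element) {
  return fields.some(function (f) { return search.test(element[f]); });
});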
