How to make URI locations from the AST map onto offsets in a file read

In ClaiR it is not (yet) possible to write changes made in the AST back to file.
For this reason, I create a list, lrel[int, int, str] changes = [];, holding the start position and end position of each substring to remove, together with the string that should replace it.
When I have the full list of changes I want to make to a source file, I sort the changes and read the file with fb = chars(readFile(f));
then apply the changes with
public list[int] changeCharList(list[int] charList, lrel[int, int, str] changesList) {
int offset = 0;
for (t <- [0 .. size(changesList)]) {
tuple[int startIndex, int endIndex, str changeWithString] change = changesList[t];
int startIndexWithOffset = change.startIndex + offset;
int endIndexWithOffset = change.endIndex + offset;
list[int] changeWithChars = chars(change.changeWithString);
for (i <- [startIndexWithOffset .. endIndexWithOffset]) {
charList = delete(charList, startIndexWithOffset);
}
for (i <- [0 .. size(changeWithChars)]) {
charList = insertAt(charList, startIndexWithOffset + i, changeWithChars[i]);
}
offset += size(changeWithChars) - (change.endIndex - change.startIndex);
}
return charList;
}
and write to file writeFileBytes(f, fb);
This approach works for source files without expanded macros, but it does not work for source files with expanded macros. In the latter case the offsets used in the AST do not match the offsets in the file as read with readFile.
As a workaround I can comment out the macros before running Rascal and uncomment them afterwards. I do not like this.
Is there a way to recalculate the offsets so that the AST offsets match the offsets of the file as read?
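For readers more familiar with Python, the change-list bookkeeping described above can be sketched as follows. This is only an illustration of applying (start, end, replacement) triples to a text: apply_changes is a hypothetical name, end offsets are assumed to be exclusive, and the sketch assembles the result from slices instead of deleting and inserting characters one at a time. It does not address the macro-expansion offset mismatch itself.

```python
def apply_changes(text, changes):
    """Apply (start, end, replacement) edits to text; end is exclusive.

    Offsets always refer to the ORIGINAL text, so no running offset is needed.
    """
    pieces = []
    cursor = 0  # first position of the original text not yet copied
    for start, end, replacement in sorted(changes):
        if start < cursor or end < start:
            raise ValueError("changes overlap or are malformed")
        pieces.append(text[cursor:start])  # untouched text before the edit
        pieces.append(replacement)         # the substituted text
        cursor = end                       # skip the removed substring
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


print(apply_changes("int x = MAX;", [(8, 11, "100"), (0, 3, "long")]))
```

Because every slice is taken from the original text, the changes can be supplied in any order as long as they do not overlap.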

Copying a structure to another structure with MFC

I have this structure:
using SPECIAL_EVENT_S = struct tagSpecialEvent
{
COleDateTime datEvent;
CString strEvent;
CString strLocation;
int iSRREventType;
int iSMREventType;
int iForeignLanguageGroupMenuID;
COleDateTime datEventStartTime;
COleDateTime datEventFinishTime;
BOOL bEventAllDay;
BOOL bSetReminder;
int iReminderUnitType;
int iReminderInterval;
int iImageWidthPercent;
CString strImagePath;
CString strTextBeforeImage;
CString strTextAfterImage;
CChristianLifeMinistryDefines::VideoConferenceEventType eType;
};
And I have instances of this structure as pointers in CListBox items. I now have a need to duplicate a structure so that it is a new instance. At the moment I am doing it like this:
auto* psThisEvent = static_cast<SPECIAL_EVENT_S*>(m_lbEvents.GetItemDataPtr(iThisEventIndex));
if (psThisEvent == nullptr)
return;
auto* psNewEvent = new SPECIAL_EVENT_S;
if (psNewEvent == nullptr)
return;
psNewEvent->bEventAllDay = psThisEvent->bEventAllDay;
psNewEvent->bSetReminder = psThisEvent->bSetReminder;
psNewEvent->datEvent = datNewEvent;
psNewEvent->datEventFinishTime = psThisEvent->datEventFinishTime;
psNewEvent->datEventStartTime = psThisEvent->datEventStartTime;
psNewEvent->eType = psThisEvent->eType;
psNewEvent->iForeignLanguageGroupMenuID = psThisEvent->iForeignLanguageGroupMenuID;
psNewEvent->iImageWidthPercent = psThisEvent->iImageWidthPercent;
psNewEvent->iReminderInterval = psThisEvent->iReminderInterval;
psNewEvent->iReminderUnitType = psThisEvent->iReminderUnitType;
psNewEvent->iSMREventType = psThisEvent->iSMREventType;
psNewEvent->iSRREventType = psThisEvent->iSRREventType;
psNewEvent->strEvent = psThisEvent->strEvent;
psNewEvent->strImagePath = psThisEvent->strImagePath;
psNewEvent->strLocation = psThisEvent->strLocation;
psNewEvent->strTextAfterImage = psThisEvent->strTextAfterImage;
psNewEvent->strTextBeforeImage = psThisEvent->strTextBeforeImage;
Is this the right way to go about this? I saw this question but I am not sure if it is safe to use memcpy in this case.
"I am not sure if it is safe to use memcpy in this case."
Your doubts are well-founded. The SPECIAL_EVENT_S structure has members that are not trivially copyable (i.e. cannot be properly copied using memcpy). For example, it contains several CString members – a class with embedded data buffers and pointers; thus if the structure is simply copied memory-to-memory, then destroying one structure (the destination) will potentially cause those data buffers of the CString objects in the other structure (the source) to be invalidated. You must call the CString copy constructor for each of those objects. (The same may also be true of the COleDateTime member objects.)
As mentioned in the comments, calling the implicitly-defined copy constructor or copy assignment operator for the SPECIAL_EVENT_S should take care of this; something along the lines of:
*psNewEvent = *psThisEvent;
But, as you have correctly noted, you will then need to explicitly assign the datEvent member after that copy constructor/assignment:
psNewEvent->datEvent = datNewEvent;

Modified function not working as intended without recursion

I have a recursive function which iterates through directory trees, listing the file names located in them.
Here is the function:
void WINAPI SearchFile(PSTR Directory)
{
HANDLE hFind;
WIN32_FIND_DATA FindData;
char SearchName[1024],FullPath[1024];
memset(SearchName,0,sizeof(SearchName));
memset(&FindData,0,sizeof(WIN32_FIND_DATA));
sprintf(SearchName,"%s\\*",Directory);
hFind=FindFirstFile(SearchName,&FindData);
if(hFind!=INVALID_HANDLE_VALUE)
{
while(FindNextFile(hFind,&FindData))
{
if(FindData.cFileName[0]=='.')
{
continue;
}
memset(FullPath,0,sizeof(FullPath));
sprintf(FullPath,"%s\\%s",Directory,FindData.cFileName);
if(FindData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
{
MessageBoxA(NULL, FullPath, "Directory", MB_OK);
SearchFile(FullPath);
}
else
{
MessageBoxA(NULL, FullPath, "File", MB_OK);
}
}
FindClose(hFind);
}
}
There are obviously differences between both functions but I don't understand what's making them act differently. Does anyone know why I am having this problem?
To understand the error quickly, look at these lines:
goto label;
//SearchFile(FullPath);
At this point hFind holds a valid handle, and FindClose(hFind) must eventually be called for it. But after goto label; executes, you overwrite hFind with hFind = FindFirstFile(SearchName, &FindData);, so the original handle is never closed and you can never return to finish iterating the parent folder after descending into the sub-folder. This is the key point: the original hFind must be saved before entering a subdirectory and restored afterwards. With a recursive function call this happens automatically, because every subdirectory is enumerated in its own stack frame, which has its own hFind. Recursion is the natural solution here.
It is nevertheless possible to convert the recursion into a loop, because the function always calls itself from a single place and therefore always returns to that single place. So instead of saving a return address on the stack, we can do an unconditional jump (goto) to a known location.
The code also has some other errors: you never check the string buffers for overflow; 1024 is hard-coded as the maximum length although a file path can be roughly 32,767 characters long; you do not check for reparse points, so you can end up in an infinite loop; you use FindFirstFile instead of FindFirstFileEx; and so on.
Correct code for enumerating sub-folders in a loop could look like this:
void DoEnum(PCWSTR pcszRoot)
{
SIZE_T FileNameLength = wcslen(pcszRoot);
// initial check for . and ..
switch (FileNameLength)
{
case 2:
if (pcszRoot[1] != '.') break;
case 1:
if (pcszRoot[0] == '.') return;
}
static const WCHAR mask[] = L"\\*";
WCHAR FileName[MAXSHORT + 1];
if (_countof(FileName) < FileNameLength + _countof(mask))
{
return;
}
ULONG dwError;
HANDLE hFindFile = 0;
WIN32_FIND_DATA FindData{};
enum { MaxDeep = 0x200 };
//++ stack
HANDLE hFindFileV[MaxDeep];
PWSTR pszV[MaxDeep];
char prefix[MaxDeep+1];
//--stack
ULONG Level = MaxDeep;
memset(prefix, '\t', MaxDeep);
prefix[MaxDeep] = 0;
PWSTR psz = FileName;
goto __enter;
__loop:
hFindFile = FindFirstFileEx(FileName, FindExInfoBasic, &FindData, FindExSearchNameMatch, 0, FIND_FIRST_EX_LARGE_FETCH);
if (hFindFile != INVALID_HANDLE_VALUE)
{
do
{
pcszRoot = FindData.cFileName;
// skip . and ..
switch (FileNameLength = wcslen(pcszRoot))
{
case 2:
if (pcszRoot[1] != '.') break;
case 1:
if (pcszRoot[0] == '.') continue;
}
if (FindData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
{
if ((SIZE_T)(FileName + _countof(FileName) - psz) < FileNameLength + _countof(mask))
{
continue;
}
__enter:
memcpy(psz, pcszRoot, (1 + FileNameLength) * sizeof(WCHAR));
if (FindData.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT)
{
DbgPrint("%sreparse point: <%S>\n", prefix + Level, pcszRoot);
}
else
{
if (Level)
{
DbgPrint("%s<%S>\n", prefix + Level, psz);
hFindFileV[--Level] = hFindFile;
pszV[Level] = psz;
memcpy(psz += FileNameLength, mask, sizeof(mask));
psz++;
goto __loop;
__return:
*--psz = 0;
psz = pszV[Level];
hFindFile = hFindFileV[Level++];
DbgPrint("%s</%S>\n", prefix + Level, psz);
}
}
}
else
{
DbgPrint("%s[%u%u] %S\n", prefix + Level, FindData.nFileSizeLow, FindData.nFileSizeHigh, pcszRoot);
}
if (!hFindFile)
{
// top level exit
return ;
}
} while (FindNextFile(hFindFile, &FindData));
if ((dwError = GetLastError()) == ERROR_NO_MORE_FILES)
{
dwError = NOERROR;
}
FindClose(hFindFile);
}
else
{
dwError = GetLastError();
}
if (dwError)
{
DbgPrint("<%S> err = %u\n", FileName, dwError);
}
goto __return;
}
The difference comes from the confusion introduced by goto label. In the recursive version, once the recursive call completes, execution returns to the place of the call and continues from there.
That is, your code goes on executing while (FindNextFile(hFind, &FindData)). When you use goto label instead, execution jumps out of the original loop and restarts from the label, which leads to what you described: a single directory tree is listed before the program ends.
If you rewrite the modified code as the following iterative version, you can see why the problem occurs.
void fun()
{
const char* Directory = "D:\\test";
HANDLE hFind;
WIN32_FIND_DATA FindData;
char SearchName[1024], FullPath[1024];
char LastName[1024] = "";
// Holds the next directory to scan. Declared at function scope so that
// Directory never points at an array that has already gone out of scope.
char NextDirectory[1024];
while (1)
{
memset(SearchName, 0, sizeof(SearchName));
memset(&FindData, 0, sizeof(WIN32_FIND_DATA));
sprintf(SearchName, "%s\\*", Directory);
// Stop once the same search pattern comes up twice in a row
if (strcmp(SearchName, LastName) == 0)
{
return;
}
strcpy(LastName, SearchName);
hFind = FindFirstFile(SearchName, &FindData);
if (hFind != INVALID_HANDLE_VALUE)
{
while (FindNextFile(hFind, &FindData))
{
if (FindData.cFileName[0] == '.')
{
continue;
}
memset(FullPath, 0, sizeof(FullPath));
sprintf(FullPath, "%s\\%s", Directory, FindData.cFileName);
if (FindData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
{
MessageBoxA(NULL, Directory, "Directory", MB_OK);
memset(NextDirectory, 0, sizeof(NextDirectory));
sprintf(NextDirectory, "%s", FullPath);
// Like the goto: abandon the rest of this directory and restart in the sub-directory
Directory = NextDirectory;
break;
}
else
{
MessageBoxA(NULL, FullPath, "File", MB_OK);
}
}
FindClose(hFind);
}
}
}
So you cannot achieve the same result as recursion simply by using goto; as the code is written, only recursion works here. Of course, I have provided a way to traverse directories non-recursively using queues, which is a more systematic approach.
One of the key things that you obtain from recursion is a separate set of local variables for each call to the recursive function. When a function calls itself, and in the recursive call modifies local variables, those local-variable changes do not (directly) affect the local variables of the caller. In your original program, this applies to variables hFind, FindData, SearchName, and FullPath.
If you want similar behavior in a non-recursive version of the function then you need to manually preserve the state of your traversal of one level of the tree when you descend to another level. The goto statement doesn't do any such thing -- it just redirects the control flow of your program. Although there are a few good use cases for goto in C, they are uncommon, and yours is not one of them.
There are several ways to implement manually preserving state, but I would suggest:
Creating a structure type in which to store the data that characterize the state of your traversal of a particular level. Those appear to be only hFind and FindData; it looks like the other locals don't need to be preserved. Maybe something like this, then:
struct dir_state {
HANDLE hFind;
WIN32_FIND_DATA FindData;
};
Dynamically allocating an array of structures of that type.
unsigned depth_limit = DEFAULT_DEPTH_LIMIT;
struct dir_state *traversal_states
= malloc(depth_limit * sizeof(*traversal_states));
if (traversal_states == NULL) // ... handle allocation error ...
Tracking the depth of your tree traversal, and for each directory you process, using the array element whose index is the relative depth of that directory.
// For example:
traversal_states[depth].hFind
= FindFirstFile(SearchName, &traversal_states[depth].FindData);
// etc.
Remembering the size of the array, so as to be able to reallocate it larger if the traversal descends too deep for its current size.
// For example:
if (depth >= depth_limit) {
depth_limit = depth_limit * 3 / 2;
struct dir_state *temp
= realloc(traversal_states, depth_limit * sizeof(*traversal_states));
if (temp == NULL) {
// handle error, discontinuing traversal
}
traversal_states = temp;
}
Also, use an ordinary for, while, or do loop instead of a backward-jumping goto. There will be a few details to work out to track when to use FindFirstFile and when FindNextFile (which you would still have with goto), but I'm sure you can sort it out.
Details are left as an exercise.
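A minimal sketch of that state-preserving idea, written in Python rather than C and therefore only an analogy: each open os.scandir iterator stands in for one dir_state (the hFind of one level), and a Python list plays the role of the growable array. The name list_tree is made up for this illustration.

```python
import os


def list_tree(root):
    """Depth-first listing that keeps one open directory iterator per level."""
    found = []
    stack = [os.scandir(root)]        # one saved traversal state per level
    while stack:
        entry = next(stack[-1], None)  # resume the deepest open directory
        if entry is None:              # that level is exhausted
            stack.pop().close()        # counterpart of FindClose
            continue
        if entry.is_dir(follow_symlinks=False):
            found.append(("Directory", entry.path))
            # Descend, leaving the parent's iterator open on the stack
            stack.append(os.scandir(entry.path))
        else:
            # Regular files, and symlinks that are deliberately not followed
            found.append(("File", entry.path))
    return found
```

When a sub-directory is finished, the loop simply continues with the parent's saved iterator, which is exactly the step that a bare goto loses.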
Unless it is ruled out by memory or processing constraints, or by the risk of unbounded recursion that would be complicated to guard against, there really isn't much reason to avoid recursion here, since it leads to a readable and elegant solution.
I also want to point out that in "modern" C, a solution built on goto is likely not one you want, since gotos are so often confusing to use and can lead to memory issues (we have loops now to make all of that so much simpler).
Instead of the gotos I would suggest implementing a stack of directories. Wrap the printing logic in a while or do-while loop, and as you iterate over the files push any directories onto the stack. At every new iteration pop the directory at the head of the stack and walk it. The loop condition just needs to check whether the directory stack is empty before continuing its block.
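The directory-stack suggestion might look roughly like this in Python. The name walk_with_stack is hypothetical, and os.scandir stands in for the FindFirstFile/FindNextFile pair. Directories reached through symbolic links are not entered, an assumption made here to avoid cycles.

```python
import os


def walk_with_stack(root):
    """Return every non-directory path under root, without recursion."""
    pending = [root]   # directories that still have to be walked
    files = []
    while pending:     # loop until the directory stack is empty
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)  # walk it in a later iteration
                else:
                    files.append(entry.path)    # the "printing logic" goes here
    return files
```

Each directory is read completely before the next one is opened, so only one directory handle is open at a time, at the cost of visiting entries in a different order than the recursive version.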

Metalkit: MTLBuffer and pointers in swift 3

I started with Metalkit and I have a very simple kernel as a test case.
kernel void compute(device float* outData [[ buffer(0) ]])
{
outData[0] = 234.5;
outData[3] = 345.6;
}
This "computed" data is stored in a MTLBuffer.
var buffer : MTLBuffer?
...
buffer = device.makeBuffer(length: MemoryLayout<Float>.size * 5, options: [])
...
commandBuffer.waitUntilCompleted()
At this point the kernel has written some test data to the MTLBuffer.
Question is how I should access that data from my main program?
I get an UnsafeMutableRawPointer from buffer.contents(). How do I get a Swift array of values that I can use everywhere else (displaying on screen, writing to file, ...)?
These snippets work in this very simple app, but I am not sure if they are correct:
let raw = buffer.contents()
let b = raw.bindMemory(to: Float.self, capacity: 5)
print(b.advanced(by: 3).pointee)
let a = raw.assumingMemoryBound(to: Float.self)
print(a.advanced(by: 3).pointee)
let bufferPointer = UnsafeBufferPointer(start: b, count: 5)
let values = Array(bufferPointer)
print(values)
let value = raw.load(fromByteOffset: MemoryLayout<Float>.size * 3, as: Float.self)
print(value)
Both bindMemory and assumingMemoryBound work. Though assumingMemoryBound assumes the underlying bytes are already typed and bindMemory doesn't. I think that one of either should work, but not both. Which one should it be and why?
I use the code presented below to load to arrays, but I can't decide if mine or your version is best.
let count = 16
var array = [Float]()
array.reserveCapacity(count)
for i in 0..<count {
array.append(buffer.contents().load(fromByteOffset: MemoryLayout<Float>.size * i, as: Float.self))
}

Define dictionary in protocol buffer

I'm new to both protocol buffers and C++, so this may be a basic question, but I haven't had any luck finding answers. Basically, I want the functionality of a dictionary defined in my .proto file like an enum. I'm using the protocol buffer to send data, and I want to define units and their respective names. An enum would allow me to define the units, but I don't know how to map the human-readable strings to that.
As an example of what I mean, the .proto file might look something like:
message DataPack {
// obviously not valid, but something like this
dict UnitType {
KmPerHour = "km/h";
MiPerHour = "mph";
}
required int id = 1;
repeated DataPoint pt = 2;
message DataPoint {
required int id = 1;
required int value = 2;
optional UnitType theunit = 3;
}
}
and then have something like this to create and handle messages:
// construct
DataPack pack;
pack->set_id(123);
DataPack::DataPoint pt = pack.add_point();
pt->set_id(456);
pt->set_value(789);
pt->set_unit(DataPack::UnitType::KmPerHour);
// read values
DataPack::UnitType theunit = pt.unit();
cout << theunit.name << endl; // print "km/h"
I could just define an enum with the unit names and write a function to map them to strings on the receiving end, but it would make more sense to have them defined in the same spot, and that solution seems too complicated (at least, for someone who has lately been spoiled by the conveniences of Python). Is there an easier way to accomplish this?
You could use custom options to associate a string with each enum member:
https://developers.google.com/protocol-buffers/docs/proto#options
It would look like this in the .proto:
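import "google/protobuf/descriptor.proto";
// Options attached to enum values extend EnumValueOptions, not FieldOptions
extend google.protobuf.EnumValueOptions {
optional string name = 12345;
}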
enum UnitType {
KmPerHour = 1 [(name) = "km/h"];
MiPerHour = 2 [(name) = "mph"];
}
Beware, though, that some third-party protobuf libraries don't understand these options.
In proto3, it's:
import "google/protobuf/descriptor.proto";
extend google.protobuf.EnumValueOptions {
string name = 12345;
}
enum UnitType {
KM_PER_HOUR = 0 [(name) = "km/h"];
MI_PER_HOUR = 1 [(name) = "mph"];
}
and to access it in Java:
UnitType.KM_PER_HOUR.getValueDescriptor().getOptions().getExtension(MyOuterClass.name);

display records which exist in file2 but not in file1

log file1 contains records of customers(name,id,date) who visited yesterday
log file2 contains records of customers(name,id,date) who visited today
How would you display customers who visited yesterday but not today?
Constraint is: Don't use auxiliary data structure because file contains millions of records. [So, no hashes]
Is there a way to do this using Unix commands ??
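Here is an example, but check the man page of comm for the options you want; -23 suppresses the lines unique to the second file and the lines common to both, leaving only the lines that appear in the first file.
comm -23 <(sort -u yesterday) <(sort -u today)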
The other tool you can use is diff
diff <(sort -u yesterday) <(sort -u today)
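Since each record carries a date, whole lines from the two logs will presumably differ even for the same customer, so a line-based comparison only helps once it is reduced to the id. A possible sketch of the same sorted-merge idea in Python, streaming both inputs without loading them into memory: it assumes tab-separated name, id, date records and that both inputs were already sorted on the id field as plain strings (for example with sort in the C locale). The function name is made up for illustration.

```python
def ids_only_in_first(first_lines, second_lines, key_field=1, sep="\t"):
    """Yield records of first_lines whose id never occurs in second_lines.

    Both arguments are iterables of lines already sorted by the id field.
    """
    def key(line):
        return line.rstrip("\n").split(sep)[key_field]

    second = iter(second_lines)
    current = next(second, None)
    for line in first_lines:
        wanted = key(line)
        # Advance the second input until it catches up with the wanted id
        while current is not None and key(current) < wanted:
            current = next(second, None)
        if current is None or key(current) != wanted:
            yield line.rstrip("\n")


yesterday = ["ann\t1\t2024-01-01\n", "bob\t2\t2024-01-01\n", "cy\t4\t2024-01-01\n"]
today = ["bob\t2\t2024-01-02\n", "dee\t3\t2024-01-02\n"]
for record in ids_only_in_first(yesterday, today):
    print(record)
```

With real logs, two open file objects can be passed instead of the lists, since the function only iterates over lines.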
Personally I was going for creating a data structure holding the records of visits, but I can see how you'd do it another way too.
In Python, that looks something like the following; it could be rewritten in Perl or shell script or ...
import subprocess

with open("myfile") as records:
    for line in records:
        # split out data. For the sake of it I'm assuming name\tid\tdate
        fields = line.rstrip("\n").split("\t")
        customer_id = fields[1]
        # grep -q prints nothing and exits with status 1 when there is no match;
        # -F takes the id as a fixed string, -w matches it as a whole word only
        result = subprocess.run(["grep", "-q", "-F", "-w", "--", customer_id, "file1"])
        if result.returncode == 1:
            print(fields)  # it wasn't in file1
That's not perfect and not tested, so treat it accordingly, but it gives you the gist of how you'd use Unix commands. That said, as sfussenegger points out, C/C++, if that's what you're using, should be able to handle pretty large files.
Disclaimer: this is a not so neat solution (repeatedly calling grep) to match the requirements of the question. If I was doing it, I would use C.
Is a customer identified by id? Is it an int or long? If the answer to both questions is yes, an array with 10,000,000 integers shouldn't take more than 10M*4 = 40MB memory - not a big deal on decent hardware. Simply sort and compare them.
btw, sorting an array with 10M random ints takes less than 2 seconds on my machine - again, nothing to be afraid of.
Here's some very simple Java code (it needs java.util.Arrays and java.util.Random, and has to live inside a class):
public static void main(final String args[]) throws Exception {
// elements in each log file
int count = 10000000;
// "read" our log file
Random r = new Random();
int[] a1 = new int[count];
int[] a2 = new int[count];
for (int i = 0; i < count; i++) {
a1[i] = Math.abs(r.nextInt());
a2[i] = Math.abs(r.nextInt());
}
// start timer
long start = System.currentTimeMillis();
// sort logs
Arrays.sort(a1);
Arrays.sort(a2);
// counters for each array
int i1 = 0, i2 = 0, i3 = 0;
// result array
int[] a3 = new int[count];
// walk both sorted arrays in step until one of them is exhausted
while (i1 < count && i2 < count) {
int n1 = a1[i1], n2 = a2[i2];
if (n1 == n2) {
// we found a match, save value if unique and increment counters
if (i3 == 0 || a3[i3-1] != n1) a3[i3++] = n1;
i1++;
i2++;
} else if (n1 < n2) {
// n1 is lower, increment counter (next value is higher)
i1++;
} else {
// n2 is lower, increment counter (next value is higher)
i2++;
}
}
// we found our results
System.out.println(i3 + " common clients");
System.out.println((System.currentTimeMillis() - start) + "ms");
}
result
// sample output on my machine:
46308 common clients
3643ms
as you see, quite efficient for 10M records in each log
