I am working on a HPC at the moment, and I have a question regarding the icc compiler.
What I want to do is to have a peek at what is going on when I change the optimisation levels through [O0..O3]. The data I want, regarding vectorization and whether code was folded inline etc., seems to be in the report generated by the -qopt-report flag.
I decided to use the greatest level of verbosity for the report which is
-qopt-report5 (I think this is the correct way to use it)
however, when reducing the O-level, the report gets progressively smaller until becoming empty when using the -O0 flag:
icc -O0 -qopt-report5 -c test1.c
I'll keep looking, but if anyone notices me being brain dead, I'd appreciate a pointer to the use of these flags together!
Thanks in advance for any hints.
Cheers,
MArk.
-qopt-report5 is always disabled when you use -O0.
This is "by definition", because opt-report == "optimization report" and O0 == "no optimization", so there is nothing to report about.
Auto-vectorization is generally speaking enabled starting from O2 optimizaiton level, so if you want to explore vectorization aspects, then you need to use at least "-O2 -opt-report5" combination or "higher".
If you want to correlate performance "peaks" and "optimization report", consider using Intel "Vectorization Advisor" (read more here, download from this location for now: https://software.intel.com/en-us/advisor_getting_started_intro )
Related
I use both GNU Make and - woe is me - ClearCase' clearmake.
Now, GNU make respect a flag named MAKEFLAGS, which for me is set to j20 on this multi-core machine I'm on. Unfortunately, clearmake also recognizes this option, yet doesn't except this value. It tells me:
clearmake: Error: Bad option (j)
clearmake: Error: Bad option (2)
clearmake: Error: Bad option (0)
Questions:
Why is this happening? Should ClearMake accommodate GNU Make's usage?
How can I get around it, other then turning the flag off an on repeatedly?
It's been 15 years or so since I used clearmake, but assuming it doesn't support the GNU make-specific GNUMAKEFLAGS variable you can use:
export GNUMAKEFLAGS=-j20
and leave MAKEFLAGS unset.
The "BUILDING SOFTWARE WITH CLEARCASE" clearly states in its T"unsupported Gnu make features" that this option is indeed not supported.
–j [JOBS]
--jobs=[JOBS]
Maybe a clearmake -C -J can help (for testing): there should then be no limit to the number of parallel builds.
Are you calling GNU make from a clearmake build script? Or are you trying to create a single makefile that will support both build tools? I think the GNUMAKEFLAGS EV is safer for GNU make specific values. I would also use
CCASE_MAKEFLAGS for any makeflags that are specific to clearmake.
CCASE_CONC to set the concurrency value. While clearmake no longer passes the -J in MAKEFLAGS, it used to, and if you're using an older clearmake (somewhere in the 7's as I recall), you could upset "child" GNU make sessions since they like -J about as much as clearmake likes -j.
Finally, check the env_ccase man page for the behavior mentioned in CCASE_MAKEFLAGS_V6_OBSOLETE. If you pass MAKEFLAGS explicitly in the build script like
$(MAKE) $(MAKEFLAGS) TARGET=x
And had started clearmake like this:
clearmake -C gnu TARGET=Y
You'll actually get both TARGET macro definitions in the command line. Setting the mentioned EV (at all) avoids the "pass defined macros in MAKEFLAGS" behavior. The switch exists because some people have makefiles that DEPEND on this behavior, while others have ones BROKEN BY this behavior...
Assuming for the sake of argument that your company has a support agreement with either IBM or HCL, this is a good time to use your support channels to bring up clearmake concerns.
I'm trying to profile an existing application with a quite complicated structure. For now I am using perf_event_open and the needed ioctl calls for enabling the events which are of my interest.
The manpage stays that PERF_COUNT_HW_INSTRUCTIONS should be used carefully - so which one should be preferred in case of a Skylake processor? Maybe a specific Intel PMU?
The perf_event_open manpage http://man7.org/linux/man-pages/man2/perf_event_open.2.html
says about PERF_COUNT_HW_INSTRUCTIONS:
PERF_COUNT_HW_INSTRUCTIONS Retired instructions. Be careful, these can be affected by various issues, most notably hardware interrupt counts.
I think this means that COUNT_HW_INSTRUCTIONS can be used (and it is supported almost everywhere). But exact values of COUNT_HW_INSTRUCTIONS for some code fragment may be slightly different in several runs due to noise from interrupts or another logic.
So it is safe to use events PERF_COUNT_HW_INSTRUCTIONS and PERF_COUNT_HW_CPU_CYCLES on most CPU. perf_events subsystem in Linux kernel will map COUNT_HW_CPU_CYCLES to some raw events more suitable to currently used CPU and its PMU.
Depending on your goals you should try to get some statistics on PERF_COUNT_HW_INSTRUCTIONS values for your code fragment. You can also check stability of this counter with several runs of perf stat with some simple program:
perf stat -e cycles:u,instructions:u /bin/echo 123
perf stat -e cycles:u,instructions:u /bin/echo 123
perf stat -e cycles:u,instructions:u /bin/echo 123
Or use integrated repeat function of perf stat:
perf stat --repeat 10 -e cycles:u,instructions:u /bin/echo 123
I have +-10 instructions events variation (less than 0.1%) for 200 thousands total instructions executed, so it is very stable. For cycles I have 5% variation, so it should be cycles event marked with careful warning.
I have a C++ library and it has a few of C++ static objects. The library could suffer from C++ static initialization fiasco. I'm trying to vet unforeseen translation unit dependencies by randomizing the order of the *.o files during a build.
I visited 2.3 How make Processes a Makefile in the GNU manual and it tells me:
Goals are the targets that make strives ultimately to update. You can override this behavior using the command line (see Arguments to Specify the Goals) ...
I also followed to 9.2 Arguments to Specify the Goals, but a treatment was not provided. It did not surprise me.
Is it possible to have Make randomize its goals? If so, then how do I do it?
If not, are there any alternatives? This is in a test environment, so I have more tools available to me than just GNUmake.
Thanks in advance.
This is really implementation-defined, but GNU Make will process targets from left to right.
Say you have an OBJS variable with the objects you want to randomize, you could write something like (using e.g. shuf):
RAND_OBJS := $(shell shuf -e -- $(OBJS))
random_build: $(RAND_OBJS)
This holds as long as you're not using parallel make (-j option). If you are the order will still be randomized, but it will also depend on number of jobs, system load, current phase of the moon, etc.
Next release of GNU make will have --shuffle mode. It will allow you to execute prerequisites in random order to shake out missing dependencies by running $ make --shuffle.
The feature was recently added in https://savannah.gnu.org/bugs/index.php?62100 and so far is available only in GNU make's git tree.
I've found this dictionary by William Whitaker on the Internet and I like its parsing capabilities. But the output doesn't fit for me.
The issue (challenge for me):
Given an input form such as "audiam", the program returns the following output (plain text):
audi.am V 4 1 PRES ACTIVE SUB 1 S
audi.am V 4 1 FUT ACTIVE IND 1 S
audio, audire, audivi, auditus V (4th) [XXXAO]
hear, listen, accept, agree with; obey; harken, pay attention; be able to hear;
But I just want to receive the following text output (same input: audiam):
audiam=audio, audire, audivi, auditus
That is:
InputWord=Dictionary_Forms
So some pieces of information are needless for me.
How can I change the output of this program by modifying the Ada code?
I don't have any Ada knowledge, but I know Delphi/Pascal so it's quite easy to understand the code, isn't it? So the parts causing the text output seem to be the TEXT_IO.PUT(...) statements, right? They're all called in list_package.adb so this is probably the source file to look at.
What has to be changed in particular?
The full Ada 95 source code of this program is available on this page.
I hope some of you are able to understand Ada 95 code. Thank you very much in advance!
My compiling problems:
For use on a windows machine, I downloaded MinGW and tried to compile the source files using "MinGW Shell". But this was my input and the shell's reponse:
Compiling with the latest Cygwin version:
When I compile the program using the latest version of Cygwin, there is no error message:
There is even an .exe file which is created. Its size is 1.6 MB (1,682,616 bytes). But when I open it, it closes right away. What has gone wrong?
William Whitaker's Words is a handy tool. You may be able to find a version already built for your platform. I've not changed the code, but you can alter some things using various parameters. It's even hosted online. If you get an Ada compiler, I've included the last Makefile I used. It's a little thin on abstraction, but it includes the essential steps to compile the program and utilities, along with the steps to build the dictionaries.
TARG = words
ARGS = -O
$(TARG): *.ad[bs]
gnatmake $(TARG) $(ARGS)
all: $(TARG)
gnatmake makedict $(ARGS)
gnatmake makeinfl $(ARGS)
gnatmake makestem $(ARGS)
gnatmake makeefil $(ARGS)
#echo Please make the dicitionary
#echo ./makedict DICTLINE.GEN
#echo ./makestem STEMLIST.GEN
#echo ./makeefil EWDSLIST.GEN
#echo ./makeinfl INFLECTS.GEN
debug:
gnatmake -g $(TARG)
clean:
rm -f *.o *.ali b~* core
cleaner: clean
rm -f *.s makedict makeinfl makestem makeefil
cleanest: cleaner
rm -f $(TARG)
Addendum: One approach is to use gcc 4.4.3 on Ubuntu 10.04 with the updated Makefile above. For convenience, I used VirtualBox to host the linux instance.
$ gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Here's a quick test using the title of my second favorite passage from Catulli Carmina.
$ ./words odi et amo
odi V 6 1 PRES ACTIVE IMP 2 S
odeo, odire, odivi(ii), - V TRANS [EXXCW] Later
od.i V 4 1 PRES ACTIVE IMP 2 S
odio, odire, odivi, - V (4th) TRANS [FXXCF] Medieval
hate; dislike; be disinclined/reluctant/adverse to; (usu. PREFDEF);
odi N 2 4 GEN S N Early
odium, odi(i) N (2nd) N [XXXAO]
hate/hatred/dislike/antipathy; odium, unpopularity; boredom/impatience;
hatred (manifested by/towards group), hostility; object of hate/odium;
od.i V 3 1 PERF ACTIVE IND 1 S
odi, odisse, osus V (3rd) PERFDEF [XXXBX]
hate (PERF form, PRES force), dislike; be disinclined/reluctant/adverse to;
et CONJ
et CONJ [XXXAX]
and, and even; also, even; (et ... et = both ... and);
am.o V 1 1 PRES ACTIVE IND 1 S
amo, amare, amavi, amatus V (1st) [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
Addendum: Once you've got it running, the problem of modifying it remains. A grep for Put_Line\( shows 629 hits; most are in line_stuff and list*. That's where I'd start. As you are learning Ada, there are several good tutorials here.
As much as I like Ada and would encourage you to learn the minimal amount it would require to hack it the way you want...
Really, you are asking for a simple data filter, which it would be quite easy to accomplish by piping your output to awk. If you are running on any flavor of Linux you have awk already (and really should learn to use it). If you are on Windows, you can get awk and all sorts of other useful goodies from MinGW, which is one of the places you'd need to go to get an Ada compiler anyway.
If you do want a Windows Ada compiler, I'd suggest getting GNAT/GCC from there. The two other flavors available, GNAT/GPL and GNAT/PRO are available from AdaCore (the maintainers). However, GNAT/PRO must be purchased and GNAT/GPL renders distributions of any program compiled using it GPL. You might not mind the GPL applying to your program I suppose, but I'm guessing this isn't a serious enough need to spring for commercial support.
If you are on Linux, the GNAT Ada compiler should be available with GCC as an option (if not installed by default). The same two other options from AdaCore are available there too of course, if you like.
Well, you asked about learning Ada. Really, if you are familiar with other compiled procedural languages (eg: C/C++, Java, Pascal, Modula-2, etc.) you shouldn't have too much trouble picking it up. This question covers Ada books. For myself, I generally just use the official LRM as a reference. Unlike most languages, Ada has an internationally standardized Language Reference Manual that is available online for free. It is also quite readable, as such things go.
About compiling: you can use GNAT. It supports Ada83, Ada95, and Ada05. To tell it to use Ada95, use the -gnat95 switch.
You can get it on http://libre.adacore.com/
VendorString() doesn't work, it's always Sun Microsystems, even if it is Xorg built for Solaris.
$ xdpyinfo | grep vendor
vendor string: The X.Org Foundation
vendor release number: 10601901
This is xorg-server 1.6.1 on Linux. Hopefully XOrg and XSun on Solaris will differ here.
To output these two fields, xdpyinfo calls the ServerVendor macro to determine the vendor, then parses the return of the VendorRelease macro differently depending on what ServerVendor was.
By the way, what's VendorString()? I don't have a function or macro by that name...
It's possibly a little hacky, but if you look at the list of extensions returned from Xsun and Xorg you should see that Xorg has a few extra XFree86-derived extensions.
xdpyinfo can be used to list the extensions via the command-line to check for differences; programmatically you can use XListExtensions() or XQueryExtension().
(I haven't got a Xsun X Server to hand but I'm pretty sure when I've looked in the past they have differed quite abit).
Thank you!
Oops, VendorRelease() string it is.
Anyway, unfortunately we cannot bet on this string. It changes often enough to have a trouble, for Xsun as well as for Xorg. I have found a solution working (hopefully) for them and for various other (derived) servers like Xvfb, Xnest etc.
Xsun does use a third value in an array of the keysyms for KP_ (numpad) keycodes. Xorg uses 1-st or 2-nd. A sniffer should first, obtain a keycode for a KP_ keysym, for instance XK_KP_7,
second, sniff what is in the XKeycodeToKeysym(d,keycode, [0-3]). Our XK_KP_7 will be on the index 2 for Xsun.