LameMP3FileWriter: Unsupported encoding format MuLaw Parameter name: format - wav

Trying to convert a 12 year old wav file to mp3,
8K, 8bit, Mono-channel, Mu-Law format, WAV
and I am getting this error in LameMP3FileWriter line:
LameMP3FileWriter: Unsupported encoding format MuLaw Parameter name: format
static void Main(string[] args)
{
string wavFilePath = #"C:\temp\Message.wav";
string mp3FilePath = #"C:\temp\Message.mp3";
if (!File.Exists(mp3FilePath))
{
byte[] bytearrwav = File.ReadAllBytes(wavFilePath);
byte[] bytearrmp3 = ConvertWavToMp3(bytearrwav);
File.WriteAllBytes(mp3FilePath, bytearrmp3);
}
}
public static byte[] ConvertWavToMp3(byte[] wavFile)
{
try
{
using (var retMs = new MemoryStream())
using (var ms = new MemoryStream(wavFile))
using (var rdr = new WaveFileReader(ms))
using (var wtr = new LameMP3FileWriter(retMs, rdr.WaveFormat, 128))
{
rdr.CopyTo(wtr);
return retMs.ToArray();
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
return null;
}
}
Could anyone show me how to convert this type of wav to mp3?

You need to get your file converted to a more standard format before converting it to MP3. Use WaveFormatConversionStream.CreatePcmStream to go from mu law to linear PCM 16 bit. Then the next challenge will be that LAME probably won't like 8kHz audio, so upsample to at least 16kHz, possibly higher with either another WaveFormatConversionStream or MediaFoundationResampler.

I ended up using SOX and LAME exes instead to convert to usable wav file and then convert to mp3. It turned out to be an easy and effective solution. Here's the main part of the code:
// create temp file
tempFile = this.FolderFromFile(inputFileAndPath) + "tempFile.wav";
// Part 1: Convert mu-Law WAV to floating point WAV
// perform work with no display of console window
// Example: SOX.EXE MachMsg1.wav -e floating-point MsgFloatingPoint.wav
using (this)
{
System.Diagnostics.Process proc = new System.Diagnostics.Process
{
StartInfo = new System.Diagnostics.ProcessStartInfo
{
FileName = soxFileAndPath,
Arguments = inputFileAndPath + " " + "-e floating-point" + " " + tempFile,
UseShellExecute = false,
RedirectStandardOutput = true,
CreateNoWindow = true
}
};
proc.Start();
}
// Part 2: Convert floating point WAV to MP3 using highest quality possible
// perform work with no display of console window
// Example: LAME.EXE -V4 MsgFloatingPoint.wav MsgFloatingPoint.mp3
using (this)
{
System.Diagnostics.Process proc = new System.Diagnostics.Process
{
StartInfo = new System.Diagnostics.ProcessStartInfo
{
FileName = lameFileAndPath,
Arguments = "-V4" + " " + tempFile + " " + outputFileAndPath,
UseShellExecute = false,
RedirectStandardOutput = true,
CreateNoWindow = true
}
};
proc.Start();
}
SOX command line utility:
URL: http://sox.sourceforge.net/
Version: SoX 14.4.2 released on 02/22/2015
LAME compiled command line utility:
URL: http://www.rarewares.org/mp3-lame-bundle.php
Version: LAME 3.99.5 released on 05/22/2014, Bundle compiled with Intel Compiler 14.0.3.
Hydrogenaudio recommended settings
-- Best quality, "archiving"2 : -b 320
CBR 320 is the strongest setting for MP3, with the lowest risk of artifacts. With the exception of a few situations,
quality is rarely better than the highest VBR profiles described below.
-- High quality, HiFi, home or quiet listening : -V0 (avg. 245 kbps) or -V1 (avg. 225 kbps) or -V2 (avg. 190 kbps) or -V3 (avg. 175 kbps).
These settings are considered to produce transparent encoding (transparent = most people can't distinguish the MP3
from the original in an ABX blind test). Audible differences between these presets exist, but are rare.
-- Portable, background noise and low bitrate requirement, small sizes : -V4 (avg. 160 kbps) or -V5 (avg. 130 kbps)
or -V6 (avg. 115 kbps) -V6 produces an "acceptable" quality, while -V4 should be close to perceptual transparency.
-- Very low bitrate, small sizes, eg. for voice, radio, mono encoding : --abr 80 (stereo) or --abr 56 -m m (mono)
For very low bitrates, up to 100kbps, ABR is most often the best solution.

Related

Executable binary file not working after XOR encryption and decyption

I want to encrypt an exe file (file.exe), write the encrypted version to a text file (fileenc.txt) and decrypt the data in the text file back to another exe file (filedec.exe).
file.exe and filedec.exe are the same and are expected to function the same way.
However, when I try to do this the filedec.exe does not work. Error Popup says: "This app cannot run on your PC".
Please what could be the problem?
However, when I just read the file.exe, write to fileenc.txt without encryption or decryption, and then read fileenc.txt and write data to filedec.exe without encryption or decryption, filedec.exe seems to work fine.
Also, when I try encrypting and decrypting a text file with this code, it works fine too.
But when I encrypt and decrypt an exe on the fly, filedec.exe doesn't work.
Please help me out. Thank you everyone.
Here is my full code:
Main();
function Main() {
var arrKey;
arrKey = "encryptionkey";
//Encrypt file.exe and write the encrypted form to file.txt
Crypt( "C:\\...\\file.exe", "C:\\...\\fileenc.txt", arrKey );
//Decrypt the previously encrypted file.txt and write the decrypted form to filedec.exe
Crypt( "C:\\...\\fileenc.txt", "C:\\...\\filedec.exe", arrKey );
//NOTE: file.exe and filedec.exe are expected to work fine when executed
}
function Crypt(fileIn, fileOut, key) {
var fileInRead;
//Read fileIn
var adTypeBinaryRead = 1;
var BinaryStreamRead;
BinaryStreamRead = new ActiveXObject("ADODB.Stream");
BinaryStreamRead.Type = adTypeBinaryRead;
BinaryStreamRead.Open();
BinaryStreamRead.LoadFromFile(fileIn);
fileInRead = BinaryStreamRead.Read();
//Convert fileIn binary data to string
var objRS = new ActiveXObject("ADODB.Recordset");
var DefinedSize = 1024;
var adSaveCreateOverWrite = 2;
var adFldLong = 0x80;
var adVarChar = 201;
var adTypeText = 2;
objRS.Fields.Append("filedata", adVarChar, DefinedSize, adFldLong);
objRS.Open();
objRS.AddNew();
objRS.Fields("filedata").AppendChunk(fileInRead);
var binString = objRS("filedata").value;
objRS.close();
//Make key as long as string version of fileIn
while (key.length < binString.length) {
key += key;
}
key = key;
//crypt converted string with key
var k, ss, q;
var cryptresult = "";
i = 0;
for (var index = 0; index < binString.length; index++) {
k = key.substr(i, 1);
q = binString.substr(i, 1);
ss = q.charCodeAt(0);
cryptresult = cryptresult + String.fromCharCode(q.charCodeAt(0) ^ k.charCodeAt(0));
i = i +1;
}
// write crypted string to file
var outStreamW = new ActiveXObject("ADODB.Stream");
outStreamW.Type = adTypeText;
// Charset: the default value seems to be `UTF-16` (BOM `0xFFFE` for text files)
outStreamW.Open();
outStreamW.WriteText(cryptresult);
outStreamW.Position = 0;
var outStreamA = new ActiveXObject("ADODB.Stream");
outStreamA.Type = adTypeText;
outStreamA.Charset = "windows-1252"; // important, see `cdoCharset Module Constants`
outStreamA.Open();
outStreamW.CopyTo(outStreamA); // convert encoding
outStreamA.SaveToFile(fileOut, adSaveCreateOverWrite);
outStreamW.Close();
outStreamA.Close();
}
EDIT:
More troubleshooting into my code shows that when I encrypt and decrypt file.exe ON THE FLY, and then write the decrypted data to fileenc.exe, fileenc.exe works well.
But when I encrypt file.exe and write the encrypted data to fileenc.txt and then read the fileenc.txt, decrypt the read encrypted data and write to fileenc.exe (just like in my code), fileenc.exe gets corrupted. My understanding suggests that the manner through which I write the encrypted data to fileenc.txt could be the problem here.
Please I need help, how do I go about with this.

How to get Binary data with javascript using FileReader api with correct encoding

I got a mp4 data using FileReader api, but I have a problem at encoding!
With this function,
var reader = new FileReader();
var blob = new Blob([this.response], {type : "video/mp4"});
reader.onload= function (evt) {
mp4text = evt.target.result;
mp4text = mp4text.toString()
//mp4text = mp4text.slice(22);
//mp4text = CryptoJS.AES.encrypt(mp4text, "test");
//mp4text = window.atob(mp4text);
var myBlob = new Blob([evt.target.result], {type : "video/mp4"});//NOT SAME contrast to blob!
var downloadUrl = URL.createObjectURL(myBlob);
document.getElementById('myVideo').src = downloadUrl;
}
reader.readAsBinaryString(blob);
I thought myBlob has same filedata as blob but some data changed! With more detail, Many of character are same but some hex code is different. How can I solve this problem?
Strings in JavaScript cannot represent arbitrary binary data, so doing readAsBinaryString may not be what you think.
What readAsBinaryString does is for each source byte it gives you a destination character(I don't which character encoding it uses off the top of my head).
So if you have a utf-8 character say ✔, then readAsBinaryString will give you â since that character is tree bytes long %E2%9C%94.
If you try to turn this back to binary/blob the string â is treated as utf-8 which is not 3 bytes but 7(%C3%A2%C5%93%E2%80%9D)
My suggestion would be to use readAsArrayBuffer, I'm sure CryptoJS supports arraybuffers.

Extract text with iText not works: encoding or crypted text?

I have a pdf file that as the follow security properties: printing: allowed; document assembly: NOT allowed; content copy: allowed; content copy for accessibility: allowed; page extraction:NOT allowed;
I try to get text with sample code as documentation sample as follow:
pdftext.Text = null;
StringBuilder text = new StringBuilder();
PdfReader pdfReader = new PdfReader(filename);
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
text.Append(System.Environment.NewLine);
text.Append("\n Page Number:" + page);
text.Append(System.Environment.NewLine);
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
text.Append(currentText);
progressBar1.Value++;
}
pdftext.Text += text.ToString();
pdfReader.Close();
but the output text is lines with ""??? ? ???????\n?? ??? ? " values;
seems that file is crypted or we have a encoding problem...
note that in the follow lines
var f = pdfReader.IsOpenedWithFullPermissions; -> FALSE
var f1 = pdfReader.IsEncrypted(); - > FALSE
var f2 = pdfReader.ComputeUserPassword(); - > NULL
var f3 = pdfReader.Is128Key(); - > FALSE
var f4 = pdfReader.HasUsageRights();
f, f1, f3, f4 return FALSE ...than seems that the document is not crypted,
...so I don't know if is a Encoding problem or question related to encrypet strings...
Someone can help me?
thanks in advance.
G.G.
Whenever you have trouble extracting text from a document using standard code, the first thing to do is try and copy&paste the text from it using Adobe Acrobat Reader. Adobe Reader copy&paste implements text extraction according to the recommendations of the PDF specification, and if this fails, this usually means that the necessary information required for text extraction in the document are either missing or broken (by accident or by design). To extract the text, one either needs to customize the code specifically to the specific PDF or resort to OCR.
In case of the document at hand, Adobe Reader copy&paste does result in garbage, too, just like when extracting with iText. Thus, there is something fishy in the document.
Inspecting the document one finds that the fonts contain ToUnicode mappings like this:
/CIDInit /ProcSet
findresource begin 12 dict begin begincmap /CIDSystemInfo<</Registry(Adobe)
/Ordering(Identity)
/Supplement 0
>>
def
/CMapName/F18 def
1 begincodespacerange <0000> <FFFF> endcodespacerange
44 beginbfrange
<20> <20> <0020>
<21> <21> <E0F9>
<22> <22> <E0F1>
<23> <23> <E0FA>
<24> <24> <E0F7>
<25> <25> <E0A3>
<26> <26> <E084>
<27> <27> <E097>
<28> <28> <E098>
<29> <29> <E09A>
<2A> <2A> <E08A>
<2B> <2B> <E099>
<2C> <2C> <E0A5>
<2D> <2D> <E086>
<2E> <2E> <E094>
<2F> <2F> <E0DE>
<30> <30> <E0A6>
<31> <31> <E096>
<32> <32> <E088>
<33> <33> <E082>
<34> <34> <E04C>
<35> <35> <E0A4>
<36> <36> <E0F6>
<37> <37> <E0F2>
<38> <38> <E0D8>
<39> <39> <E0AA>
<3A> <3A> <E06C>
<3B> <3B> <E087>
<3C> <3C> <E095>
<3D> <3D> <E0C4>
<3E> <3E> <E07E>
<3F> <3F> <E055>
<40> <40> <E089>
<41> <41> <E085>
<42> <42> <E083>
<43> <43> <E070>
<44> <44> <E0E6>
<45> <45> <E080>
<46> <46> <E0C8>
<47> <47> <E0F4>
<48> <48> <E062>
<49> <49> <E0F3>
<4A> <4A> <E04E>
<4B> <4B> <E05E>
endbfrange
endcmap CMapName currentdict /CMap defineresource pop end end
I.e., if you are not into this, the fonts claim that all their glyphs (with the exception of the space glyph at 0x20) represent characters U+E0xx from the Unicode private use area. As the name of that area indicates, there is no common meaning of characters with these values.
Thus, text extraction according to the PDF specification will return strings of characters with undefined meaning with results as you observed in iText or I saw in Adobe Reader.
Sometimes in such a situation one can still enforce proper text extraction by ignoring the ToUnicode map and using either the font Encoding or information inside the embedded font program.
Unfortunately it turns out that here the Encoding effectively contains the same information as does the ToUnicode map, e.g. for the same font as above
/Differences [ 32 /space /uniE0F9 /uniE0F1 /uniE0FA /uniE0F7 /uniE0A3 /uniE084 /uniE097 /uniE098
/uniE09A /uniE08A /uniE099 /uniE0A5 /uniE086 /uniE094 /uniE0DE /uniE0A6 /uniE096
/uniE088 /uniE082 /uniE04C /uniE0A4 /uniE0F6 /uniE0F2 /uniE0D8 /uniE0AA /uniE06C
/uniE087 /uniE095 /uniE0C4 /uniE07E /uniE055 /uniE089 /uniE085 /uniE083 /uniE070
/uniE0E6 /uniE080 /uniE0C8 /uniE0F4 /uniE062 /uniE0F3 /uniE04E /uniE05E ]
and the fonts turns out to be Type3 fonts, i.e. there is no embedded font program but each glyph is defined as an individual PDF canvas without further character information.
Thus, nothing to gain here either.
Actually these small PDF canvasses contain inlined bitmap graphics of the respective glyph which also is the cause of the poor graphical quality of the document (if you don't see that immediately, simply zoom in a bit and you'll see the ragged outlines of the glyphs).
By the way, such a construct usually means that the producer of the PDF explicitly wants to prevent text extraction.
If you happen to have to extract text from many such documents, you can try and determine a mapping from their U+E0xx characters to actually sensible Unicode characters and apply that mapping to your extracted text.
If all those fonts in all those documents happen to use the same U+E0xx codepoints for the same actual characters, you'll be able to do text extraction from those documents after investing a certain amount of initial work.
Otherwise do try OCR.
The following code adds pages to a document which map the ToUnicode values to the characters shown:
void AddFontsTo(PdfReader reader, PdfStamper stamper)
{
int documentPages = reader.NumberOfPages;
for (int page = 1; page <= documentPages; page++)
{
// ignore inherited resources for now
PdfDictionary pageResources = reader.GetPageResources(page);
if (pageResources == null)
continue;
PdfDictionary pageFonts = pageResources.GetAsDict(PdfName.FONT);
if (pageFonts == null || pageFonts.Size == 0)
continue;
List<BaseFont> fonts = new List<BaseFont>();
List<string> fontNames = new List<string>();
HashSet<char> chars = new HashSet<char>();
foreach (PdfName key in pageFonts.Keys)
{
PdfIndirectReference fontReference = pageFonts.GetAsIndirectObject(key);
if (fontReference == null)
continue;
DocumentFont font = (DocumentFont) BaseFont.CreateFont((PRIndirectReference)fontReference);
if (font == null)
continue;
PdfObject toUni = PdfReader.GetPdfObjectRelease(font.FontDictionary.Get(PdfName.TOUNICODE));
CMapToUnicode toUnicodeCmap = null;
if (toUni is PRStream)
{
try
{
byte[] touni = PdfReader.GetStreamBytes((PRStream)toUni);
CidLocationFromByte lb = new CidLocationFromByte(touni);
toUnicodeCmap = new CMapToUnicode();
CMapParserEx.ParseCid("", toUnicodeCmap, lb);
}
catch
{
toUnicodeCmap = null;
}
}
if (toUnicodeCmap == null)
continue;
ICollection<int> mapValues = toUnicodeCmap.CreateDirectMapping().Values;
if (mapValues.Count == 0)
continue;
fonts.Add(font);
fontNames.Add(key.ToString());
foreach (int value in mapValues)
chars.Add((char)value);
}
if (fonts.Count == 0 || chars.Count == 0)
continue;
Rectangle size = (fonts.Count > 10) ? PageSize.A4.Rotate() : PageSize.A4;
PdfPTable table = new PdfPTable(fonts.Count + 1);
table.AddCell("Page " + page);
foreach (String name in fontNames)
{
table.AddCell(name);
}
table.HeaderRows = 1;
float[] widths = new float[fonts.Count + 1];
widths[0] = 2;
for (int i = 1; i <= fonts.Count; i++)
widths[i] = 1;
table.SetWidths(widths);
table.WidthPercentage = 100;
List<char> charList = new List<char>(chars);
charList.Sort();
foreach (char character in charList)
{
table.AddCell(((int)character).ToString("X4"));
foreach (BaseFont font in fonts)
{
table.AddCell(new PdfPCell(new Phrase(character.ToString(), new Font(font))));
}
}
stamper.InsertPage(reader.NumberOfPages + 1, size);
ColumnText columnText = new ColumnText(stamper.GetUnderContent(reader.NumberOfPages));
columnText.AddElement(table);
columnText.SetSimpleColumn(size);
while ((ColumnText.NO_MORE_TEXT & columnText.Go(false)) == 0)
{
stamper.InsertPage(reader.NumberOfPages + 1, size);
columnText.Canvas = stamper.GetUnderContent(reader.NumberOfPages);
columnText.SetSimpleColumn(size);
}
}
}
I applied it to your document like this:
string input = #"4700198773.pdf";
string output = #"4700198773-fonts.pdf";
using (PdfReader reader = new PdfReader(input))
using (FileStream stream = new FileStream(output, FileMode.Create, FileAccess.Write))
using (PdfStamper stamper = new PdfStamper(reader, stream))
{
AddFontsTo(reader, stamper);
}
The additional pages look like this:
Now you have to compare the outputs for the different fonts and pages of this document with each other and with those of a representative selection of file. If you find good enough a pattern, you can try this replacement way.

Failing to write offset data to zookeeper in kafka-storm

I was setting up a storm cluster to calculate real time trending and other statistics, however I have some problems introducing the "recovery" feature into this project, by allowing the offset that was last read by the kafka-spout (the source code for kafka-spout comes from https://github.com/apache/incubator-storm/tree/master/external/storm-kafka) to be remembered. I start my kafka-spout in this way:
BrokerHosts zkHost = new ZkHosts("localhost:2181");
SpoutConfig kafkaConfig = new SpoutConfig(zkHost, "test", "", "test");
kafkaConfig.forceFromStart = false;
KafkaSpout kafkaSpout = new KafkaSpout(kafkaConfig);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("test" + "spout", kafkaSpout, ESConfig.spoutParallelism);
The default settings should be doing this, but I think it is not doing so in my case, every time I start my project, the PartitionManager tries to look for the file with the offsets, then nothing is found:
2014-06-25 11:57:08 INFO PartitionManager:73 - Read partition information from: /storm/partition_1 --> null
2014-06-25 11:57:08 INFO PartitionManager:86 - No partition information found, using configuration to determine offset
Then it starts reading from the latest possible offset. Which is okay if my project never fails, but not exactly what I wanted.
I also looked a bit more into the PartitionManager class which uses Zkstate class to write the offsets, from this code snippet:
PartitionManeger
public void commit() {
long lastCompletedOffset = lastCompletedOffset();
if (_committedTo != lastCompletedOffset) {
LOG.debug("Writing last completed offset (" + lastCompletedOffset + ") to ZK for " + _partition + " for topology: " + _topologyInstanceId);
Map<Object, Object> data = (Map<Object, Object>) ImmutableMap.builder()
.put("topology", ImmutableMap.of("id", _topologyInstanceId,
"name", _stormConf.get(Config.TOPOLOGY_NAME)))
.put("offset", lastCompletedOffset)
.put("partition", _partition.partition)
.put("broker", ImmutableMap.of("host", _partition.host.host,
"port", _partition.host.port))
.put("topic", _spoutConfig.topic).build();
_state.writeJSON(committedPath(), data);
_committedTo = lastCompletedOffset;
LOG.debug("Wrote last completed offset (" + lastCompletedOffset + ") to ZK for " + _partition + " for topology: " + _topologyInstanceId);
} else {
LOG.debug("No new offset for " + _partition + " for topology: " + _topologyInstanceId);
}
}
ZkState
public void writeBytes(String path, byte[] bytes) {
try {
if (_curator.checkExists().forPath(path) == null) {
_curator.create()
.creatingParentsIfNeeded()
.withMode(CreateMode.PERSISTENT)
.forPath(path, bytes);
} else {
_curator.setData().forPath(path, bytes);
}
} catch (Exception e) {
throw new RuntimeException(e);
}
}
I could see that for the first message, the writeBytes method gets into the if block and tries to create a path, then for the second message it goes into the else block, which seems to be ok. But when I start the project again, the same message as mentioned above shows up. No partition information can be found.
I had the same problem. Turned out I was running in local mode which uses an in memory zookeeper and not the zookeeper that Kafka is using.
To make sure that KafkaSpout doesn't use Storm's ZooKeeper for the ZkState that stores the offset, you need to set the SpoutConfig.zkServers, SpoutConfig.zkPort, and SpoutConfig.zkRoot in addition to the ZkHosts. For example
import org.apache.zookeeper.client.ConnectStringParser;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;
import storm.kafka.KeyValueSchemeAsMultiScheme;
...
final ConnectStringParser connectStringParser = new ConnectStringParser(zkConnectStr);
final List<InetSocketAddress> serverInetAddresses = connectStringParser.getServerAddresses();
final List<String> serverAddresses = new ArrayList<>(serverInetAddresses.size());
final Integer zkPort = serverInetAddresses.get(0).getPort();
for (InetSocketAddress serverInetAddress : serverInetAddresses) {
serverAddresses.add(serverInetAddress.getHostName());
}
final ZkHosts zkHosts = new ZkHosts(zkConnectStr);
zkHosts.brokerZkPath = kafkaZnode + zkHosts.brokerZkPath;
final SpoutConfig spoutConfig = new SpoutConfig(zkHosts, inputTopic, kafkaZnode, kafkaConsumerGroup);
spoutConfig.scheme = new KeyValueSchemeAsMultiScheme(inputKafkaKeyValueScheme);
spoutConfig.zkServers = serverAddresses;
spoutConfig.zkPort = zkPort;
spoutConfig.zkRoot = kafkaZnode;
I think you are hitting this bug:
https://community.hortonworks.com/questions/66524/closedchannelexception-kafka-spout-cannot-read-kaf.html
And the comment from the colleague above fixed my issue. I added some newer libraries to.

Batch for downloading most recent file (where filename changes on new version) from http website

i need a batch for downloading files from a http website (http://www.rarlab.com/download.htm).
From this website i only need the most recent version for the 32bit and 64bit english
program which is always listed at the top of this website.
Problem 1: There are more than this two files for download on the website
Problem 2: The name of the file changes with every new version
How can i download these 2 files (the most recent version) without knowing the exact file name
(and without first visiting the web page to find out the file name) ??
Maybe i can use wget, curl or aria2 for that task but i don't know the parameters/options.
Can anyone help me solving this problem ?
(Please only batch solutions - no vbs, java, jscript, powershell etc.)
thank you.
Sorry, i forgot to say that i use windows 7 32bit. And i prefer batch because the script should be able to run on all windows versions without having to download extra programs or resource kits for different windows version (as of powershell which must be downloaded for windows xp etc.) - and because i only understand batch scripting.
Here's a batch + JScript hybrid script. I know you said no vbs, java, jscript, etc, but you're going to have an awfully hard time scraping HTML with pure batch. But this does meet your other criteria -- running on all Windows versions without having to rely on optional software (like powershell or .Net).* And with JScript's XMLHTTP object you don't even need a 3rd party app to fetch web content.
As for not understanding JScript, aside from a few proprietary ActiveX objects it's just like JavaScript. In case you aren't familiar with JavaScript or regular expressions, I added copious amounts of comments to help you out. Hopefully whatever I didn't bother commenting is pretty obvious what it does.
Update
The script now detects the system locale, matches it with a language on the WinRAR download page, and downloads that language release.
Anyway, save this with a .bat extension and run it as you would any other batch script.
#if (#a==#b) #end /*
:: batch script portion
#echo off
setlocal
set "url=http://www.rarlab.com/download.htm"
set /p "savepath=Location to save? [%cd%] "
if "%savepath%"=="" set "savepath=%cd%"
cscript /nologo /e:jscript "%~f0" "%url%" "%savepath%"
goto :EOF
:: JScript portion */
// populate translation from locale identifier hex value to WinRAR language label
// http://msdn.microsoft.com/en-us/library/dd318693.aspx
var abbrev={}, a=function(arr,val){for(var i=0;i<arr.length;i++)abbrev[arr[i]]=val};
a(['1401','3c01','0c01','0801','2001','4001','2801','1c01','3801','2401'],'Arabic');
a(['042b'],'Armenian');
a(['082c','042c'],'Azerbaijani');
a(['0423'],'Belarusian');
a(['0402'],'Bulgarian');
a(['0403'],'Catalan');
a(['7c04'],'Chinese Traditional');
a(['0c04','1404','1004','0004'],'Chinese Simplified');
a(['101a'],'Croatian');
a(['0405'],'Czech');
a(['0406'],'Danish');
a(['0813','0413'],'Dutch');
a(['0425'],'Estonian');
a(['040b'],'Finnish');
a(['080c','0c0c','040c','140c','180c','100c'],'French');
a(['0437'],'Georgian');
a(['0c07','0407','1407','1007','0807'],'German');
a(['0408'],'Greek');
a(['040d'],'Hebrew');
a(['040e'],'Hungarian');
a(['0421'],'Indonesian');
a(['0410','0810'],'Italian');
a(['0411'],'Japanese');
a(['0412'],'Korean');
a(['0427'],'Lithuanian');
a(['042f'],'Macedonian');
a(['0414','0814'],'Norwegian');
a(['0429'],'Persian');
a(['0415'],'Polish');
a(['0816'],'Portuguese');
a(['0416'],'Portuguese Brazilian');
a(['0418'],'Romanian');
a(['0419'],'Russian');
a(['7c1a','1c1a','0c1a'],'Serbian Cyrillic');
a(['181a','081a'],'Serbian Latin');
a(['041b'],'Slovak');
a(['0424'],'Slovenian');
a(['2c0a','400a','340a','240a','140a','1c0a','300a','440a','100a','480a','080a','4c0a','180a','3c0a','280a','500a','0c0a','040a','540a','380a','200a'],'Spanish');
a(['081d','041d'],'Swedish');
a(['041e'],'Thai');
a(['041f'],'Turkish');
a(['0422'],'Ukranian');
a(['0843','0443'],'Uzbek');
a(['0803'],'Valencian');
a(['042a'],'Vietnamese');
function language() {
var os = GetObject('winmgmts:').ExecQuery('select Locale from Win32_OperatingSystem');
var locale = new Enumerator(os).item().Locale;
// default to English if locale is not in abbrev{}
return abbrev[locale.toLowerCase()] || 'English';
}
function fetch(url) {
var xObj = new ActiveXObject("Microsoft.XMLHTTP");
xObj.open("GET",url,true);
xObj.setRequestHeader('User-Agent','XMLHTTP/1.0');
xObj.send('');
while (xObj.readyState != 4) WSH.Sleep(50);
return(xObj);
}
function save(xObj, file) {
var stream = new ActiveXObject("ADODB.Stream");
with (stream) {
type = 1; // binary
open();
write(xObj.responseBody);
saveToFile(file, 2); // overwrite
close();
}
}
// fetch the initial web page
var x = fetch(WSH.Arguments(0));
// make HTML response all one line
var html = x.responseText.split(/\r?\n/).join('');
// create array of hrefs matching *.exe where the link text contains system language
var r = new RegExp('<a\\s*href="[^"]+\\.exe(?=[^\\/]+' + language() + ')', 'g');
var anchors = html.match(r)
// use only the first two
for (var i=0; i<2; i++) {
// use only the stuff after the quotation mark to the end
var dl = '' + /[^"]+$/.exec(anchors[i]);
// if the location is a relative path, prepend the domain
if (dl.substring(0,1) == '/') dl = /.+:\/\/[^\/]+/.exec(WSH.Arguments(0)) + dl;
// target is path\filename
var target=WSH.Arguments(1) + '\\' + /[^\/]+$/.exec(dl)
// echo without a new line
WSH.StdOut.Write('Saving ' + target + '... ');
// fetch file and save it
save(fetch(dl), target);
WSH.Echo('Done.');
}
Update 2
Here's the same script with a few minor tweaks to have it also detect the architecture (32/64-bitness) of Windows, and only download one installer instead of two:
#if (#a==#b) #end /*
:: batch script portion
#echo off
setlocal
set "url=http://www.rarlab.com/download.htm"
set /p "savepath=Location to save? [%cd%] "
if "%savepath%"=="" set "savepath=%cd%"
cscript /nologo /e:jscript "%~f0" "%url%" "%savepath%"
goto :EOF
:: JScript portion */
// populate translation from locale identifier hex value to WinRAR language label
// http://msdn.microsoft.com/en-us/library/dd318693.aspx
var abbrev={}, a=function(arr,val){for(var i=0;i<arr.length;i++)abbrev[arr[i]]=val};
a(['1401','3c01','0c01','0801','2001','4001','2801','1c01','3801','2401'],'Arabic');
a(['042b'],'Armenian');
a(['082c','042c'],'Azerbaijani');
a(['0423'],'Belarusian');
a(['0402'],'Bulgarian');
a(['0403'],'Catalan');
a(['7c04'],'Chinese Traditional');
a(['0c04','1404','1004','0004'],'Chinese Simplified');
a(['101a'],'Croatian');
a(['0405'],'Czech');
a(['0406'],'Danish');
a(['0813','0413'],'Dutch');
a(['0425'],'Estonian');
a(['040b'],'Finnish');
a(['080c','0c0c','040c','140c','180c','100c'],'French');
a(['0437'],'Georgian');
a(['0c07','0407','1407','1007','0807'],'German');
a(['0408'],'Greek');
a(['040d'],'Hebrew');
a(['040e'],'Hungarian');
a(['0421'],'Indonesian');
a(['0410','0810'],'Italian');
a(['0411'],'Japanese');
a(['0412'],'Korean');
a(['0427'],'Lithuanian');
a(['042f'],'Macedonian');
a(['0414','0814'],'Norwegian');
a(['0429'],'Persian');
a(['0415'],'Polish');
a(['0816'],'Portuguese');
a(['0416'],'Portuguese Brazilian');
a(['0418'],'Romanian');
a(['0419'],'Russian');
a(['7c1a','1c1a','0c1a'],'Serbian Cyrillic');
a(['181a','081a'],'Serbian Latin');
a(['041b'],'Slovak');
a(['0424'],'Slovenian');
a(['2c0a','400a','340a','240a','140a','1c0a','300a','440a','100a','480a','080a','4c0a','180a','3c0a','280a','500a','0c0a','040a','540a','380a','200a'],'Spanish');
a(['081d','041d'],'Swedish');
a(['041e'],'Thai');
a(['041f'],'Turkish');
a(['0422'],'Ukranian');
a(['0843','0443'],'Uzbek');
a(['0803'],'Valencian');
a(['042a'],'Vietnamese');
function language() {
var os = GetObject('winmgmts:').ExecQuery('select Locale from Win32_OperatingSystem');
var locale = new Enumerator(os).item().Locale;
// default to English if locale is not in abbrev{}
return abbrev[locale.toLowerCase()] || 'English';
}
function fetch(url) {
var xObj = new ActiveXObject("Microsoft.XMLHTTP");
xObj.open("GET",url,true);
xObj.setRequestHeader('User-Agent','XMLHTTP/1.0');
xObj.send('');
while (xObj.readyState != 4) WSH.Sleep(50);
return(xObj);
}
function save(xObj, file) {
var stream = new ActiveXObject("ADODB.Stream");
with (stream) {
type = 1; // binary
open();
write(xObj.responseBody);
saveToFile(file, 2); // overwrite
close();
}
}
// fetch the initial web page
var x = fetch(WSH.Arguments(0));
// make HTML response all one line
var html = x.responseText.split(/\r?\n/).join('');
// get OS architecture (This method is much faster than the Win32_Processor.AddressWidth method)
var os = GetObject('winmgmts:').ExecQuery('select OSArchitecture from Win32_OperatingSystem');
var arch = /\d+/.exec(new Enumerator(os).item().OSArchitecture) * 1;
// get link matching *.exe where the link text contains system language and architecture
var r = new RegExp('<a\\s*href="[^"]+\\.exe(?=[^\\/]+' + language() + '[^<]+' + arch + '\\Wbit)');
var link = r.exec(html)
// use only the stuff after the quotation mark to the end
var dl = '' + /[^"]+$/.exec(link);
// if the location is a relative path, prepend the domain
if (dl.substring(0,1) == '/') dl = /.+:\/\/[^\/]+/.exec(WSH.Arguments(0)) + dl;
// target is path\filename
var target=WSH.Arguments(1) + '\\' + /[^\/]+$/.exec(dl)
// echo without a new line
WSH.StdOut.Write('Saving ' + target + '... ');
// fetch file and save it
save(fetch(dl), target);
WSH.Echo('Done.');

Resources