System.Net.WebClient.DownloadStringTaskAsync method when webpage contains ™ or other special characters

I am using the async method System.Net.WebClient.DownloadStringTaskAsync to download a web page's content and then either process it or save it to a local folder. Everything works, except that when the page contains special characters such as ™ or ®, they do not come through in the downloaded text. Am I missing something here?
String contentToScrapeURL = "https://www.naylornetwork.com/aaho-advertorial/newsletter.asp?issueID=89542";
Boolean success = true;
using (System.Net.WebClient wc = new System.Net.WebClient())
{
String pageSourceCode = await wc.DownloadStringTaskAsync(contentToScrapeURL);
String path = @"C:\MyProjects\TestingThings\App_Data\" + "test.html";
File.WriteAllText(path, pageSourceCode);
}

Found it (or rather, remembered it): I had to set System.Net.WebClient.Encoding to Encoding.UTF8 before downloading.
Here is the updated code:
using (System.Net.WebClient wc = new System.Net.WebClient())
{
wc.Encoding = Encoding.UTF8;
String pageSourceCode = await wc.DownloadStringTaskAsync(contentToScrapeURL);
String path = @"C:\MyProjects\TestingThings\App_Data\" + "test.html";
File.WriteAllText(path, pageSourceCode);
}
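To illustrate why the encoding setting matters, here is a minimal sketch that needs no network access. It assumes the server sends UTF-8 bytes and that the client would otherwise decode them with a single-byte encoding; ISO-8859-1 is only a stand-in for whatever default applies on a given machine. The EncodingDemo class, the Decode helper, and the sample text are hypothetical.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Decodes raw response bytes with the given encoding, as a client would.
    public static string Decode(byte[] body, Encoding encoding)
    {
        return encoding.GetString(body);
    }

    static void Main()
    {
        // Hypothetical response body: the server sends UTF-8 bytes.
        byte[] body = Encoding.UTF8.GetBytes("Acme\u2122 Pro\u00AE");

        // Assumption: the client falls back to a single-byte encoding.
        string wrong = Decode(body, Encoding.GetEncoding("ISO-8859-1"));
        string right = Decode(body, Encoding.UTF8);

        Console.WriteLine(wrong.Contains("\u2122")); // False
        Console.WriteLine(right.Contains("\u2122")); // True
    }
}
```

The same bytes yield the trademark sign only when they are decoded as UTF-8, which is what setting wc.Encoding achieves for the download.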

Related

Alternative Save/OpenBinaryDirect methods for CSOM for SharePoint Online

According to the Microsoft documentation at https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/using-csom-for-dotnet-standard, the Save/OpenBinaryDirect methods are not available in .NET Core apps, and it suggests using the regular file APIs instead. What is the alternative way to read and write files stored in SharePoint Online? What are the regular file APIs? Has anyone done this? Is there any example code or documentation?
Download file in .NET Core CSOM:
using (var authenticationManager = new AuthenticationManager())
using (var context = authenticationManager.GetContext(site, user, password))
{
context.Load(context.Web, p => p.Title);
context.ExecuteQuery();
Microsoft.SharePoint.Client.File file = context.Web.GetFileByUrl("https://tenant.sharepoint.com/sites/michael/Shared%20Documents/aa.txt");
context.Load(file);
context.ExecuteQuery();
string filepath = @"C:\temp\" + file.Name;
Microsoft.SharePoint.Client.ClientResult<Stream> mstream = file.OpenBinaryStream();
context.ExecuteQuery();
using (var fileStream = new System.IO.FileStream(filepath, System.IO.FileMode.Create))
{
mstream.Value.CopyTo(fileStream);
}
using (System.IO.StreamReader sr = new System.IO.StreamReader(mstream.Value))
{
String line = sr.ReadToEnd();
Console.WriteLine(line);
}
}
Upload file in .NET Core CSOM:
string filepath = @"C:\temp\aa.txt";
FileCreationInformation newfile = new FileCreationInformation();
newfile.Url = System.IO.Path.GetFileName(filepath);
newfile.Content= System.IO.File.ReadAllBytes(filepath);
List library = context.Web.Lists.GetByTitle("Documents");
Microsoft.SharePoint.Client.File uploadFile = library.RootFolder.Files.Add(newfile);
context.Load(uploadFile);
context.ExecuteQuery();
Jerry's answer got me there, but I wanted to add a couple of extras that weren't mentioned in his answer.
If your file destination isn't the main Documents list, instead of the Lists.GetByTitle call use
var folder = context.Web.GetFolderByServerRelativeUrl(...);
File uploadFile = folder.Files.Add(newfile);
If you're updating a file, you've got to set
newfile.Overwrite = true;
And if the file you're uploading/replacing is greater than 2MB, you've got to use the ContentStream instead of Content
FileCreationInformation newfile = new FileCreationInformation
{
Url = relativeUrl,
ContentStream = stream,
Overwrite = true
};

Upload file into S3 with AWS SDK ASP.NET

I am trying to upload an image from ASP.NET to S3. I am using the AWS SDK for that and have already set up what is needed. However, when I run my project, I receive an error. I have replaced my bucket name with ... in this sample code.
I set up the secret key and access key for my user in Web.config. Please tell me if you need more code. I need help.
controller
private static readonly string _awsAccessKey = ConfigurationManager.AppSettings["AWSAccessKey"];
private static readonly string _awsSecretKey = ConfigurationManager.AppSettings["AWSSecretKey"];
[HttpPost]
public ActionResult UploadFile(HttpPostedFileBase file)
{
    try
    {
        if (file.ContentLength > 0)
        {
            IAmazonS3 client;
            using (client = Amazon.AWSClientFactory.CreateAmazonS3Client(_awsAccessKey, _awsSecretKey))
            {
                PutObjectRequest request = new PutObjectRequest
                {
                    BucketName = "...",
                    CannedACL = S3CannedACL.PublicRead,
                    Key = "images/" + (DateTime.Now.ToBinary() + "-" + file.FileName),
                    FilePath = Server.MapPath("~/UploadedFiles")
                };
                client.PutObject(request);
            }
        }
        imageUrls = "File Uploaded Successfully!!";
        System.Diagnostics.Debug.WriteLine("File Uploaded Successfully!!");
        return Json(imageUrls);
    }
    catch
    {
        ViewBag.Message = "File upload failed!!";
        System.Diagnostics.Debug.WriteLine("File upload failed!!");
        return Json(ViewBag.Message);
    }
}
You're getting the error because of DateTime.Now.ToBinary(), whose output contains characters that are not valid in a URL. You could use a GUID or a Unix timestamp instead.
Also, the FilePath property you're assigning on the PutObjectRequest is meant to hold the full path and name of the file to upload. You don't need it when you already have an HttpPostedFileBase as an input parameter, because that exposes the InputStream property (i.e., the stream object).
Your PutObjectRequest should look something like this:
.
.
.
Guid guid = Guid.NewGuid();
// Create a client
AmazonS3Client client = new AmazonS3Client(_awsAccessKey, _awsSecretKey);
// Create a PutObject request
PutObjectRequest request = new PutObjectRequest
{
BucketName = "...",
CannedACL = S3CannedACL.PublicRead,
Key = "images/" + guid + "-" + file.FileName
};
using (System.IO.Stream inputStream = file.InputStream)
{
request.InputStream = inputStream;
// Put object
PutObjectResponse response = client.PutObject(request);
}
.
.
.
I finally solved it. I realized I had not passed the region to AWSClientFactory; it goes right at the end, after the keys.
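The key-naming advice from the answer can be sketched on its own, without any AWS calls. This is only an illustration: the S3KeyDemo class and the BuildObjectKey helper are hypothetical names, and the "images" prefix is taken from the question's code.

```csharp
using System;
using System.IO;

static class S3KeyDemo
{
    // Builds an object key such as "images/<32 hex digits>-photo.png".
    public static string BuildObjectKey(string prefix, string fileName)
    {
        // Keep only the file name in case the client sent a full path.
        string name = Path.GetFileName(fileName);

        // The "N" format gives a GUID as 32 hex digits without dashes.
        return prefix + "/" + Guid.NewGuid().ToString("N") + "-" + name;
    }

    static void Main()
    {
        Console.WriteLine(BuildObjectKey("images", "photo.png"));
    }
}
```

The result could then be assigned to the Key property of the PutObjectRequest in place of the DateTime.Now.ToBinary() expression.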

Check if HttpContext.Current.Request.Files["filex"] has file

I am working on an ASP.NET web page that receives a posted form. The posted form includes three file inputs as well.
filename = uploadFile(HttpContext.Current.Request.Files["file1"], path);
This is the code I use to upload a file to my server, and this is the code of the function:
public string uploadFile(HttpPostedFile file, string dest)
{
string filename = file.FileName;
string path = Server.MapPath(dest);
String extension = Path.GetExtension(file.FileName);
filename = filename.Replace(extension, "");
filename = filename.Replace(".", "");
filename = System.DateTime.Now.ToString("ddMMyyyyhhmmss") + filename + extension;
string savepath = path + "/" + filename;
file.SaveAs(savepath);
return filename;
}
The problem is that I am not able to check whether file1 from the posted form actually contains a file. Is it possible?
Use the FileUpload control in conjunction with the HasFile property:
FileUpload.HasFile Property
If you have to do it that way, you can simply check if the ContentLength is greater than zero.

How to get result of BoilerPipe extraction in HTML instead of plain text

I'm using the following code to extract the textual content of web pages. My app is hosted on Google App Engine and works exactly like the BoilerPipe Web API. The problem is that I can only get the result in plain text format. I played around with the library to find a workaround, but I couldn't find a method that returns the result as HTML. What I am trying to do is offer an option like HTML (extract mode), as in the original BoilerPipe Web API.
This is the code I'm using for extracting the plain text.
PrintWriter out = response.getWriter();
try {
String urlString = request.getParameter("url");
String listOUtput = request.getParameter("OutputType");
String listExtractor = request.getParameter("ExtractorType");
URL url = new URL(urlString);
switch (listExtractor) {
case "1":
String mainArticle = ArticleExtractor.INSTANCE.getText(url);
out.println(mainArticle);
break;
case "2":
String fullArticle = KeepEverythingExtractor.INSTANCE.getText(url);
out.println(fullArticle);
break;
}
} catch (BoilerpipeProcessingException e) {
out.println("Sorry We Couldn't Scrape the URL you Entered " + e.getLocalizedMessage());
} catch (IOException e) {
out.println("Exception thrown");
}
How can I include the feature for displaying the result in HTML form?
I am using the Boilerpipe source code, and I solved this with the following code:
String urlString = "your url";
URL url = new URL(urlString);
URI uri = new URI(urlString);
final HTMLDocument htmlDoc = HTMLFetcher.fetch(url);
final BoilerpipeExtractor extractor = CommonExtractors.DEFAULT_EXTRACTOR;
final HTMLHighlighter hh = HTMLHighlighter.newExtractingInstance();
hh.setOutputHighlightOnly(true);
TextDocument doc;
String text = "";
doc = new BoilerpipeSAXInput(htmlDoc.toInputSource()).getTextDocument();
extractor.process(doc);
final InputSource is = htmlDoc.toInputSource();
text = hh.process(doc, is);
System.out.println(text);

401 System.UnauthorizedAccessException when access Dropbox With SharpBox API

The code
config = CloudStorage.GetCloudConfigurationEasy(nSupportedCloudConfigurations.DropBox)
as DropBoxConfiguration;
//config.AuthorizationCallBack = new Uri("http://localhost:61926/DBoxDemo.aspx");
requestToken = DropBoxStorageProviderTools.GetDropBoxRequestToken(config, "KEY", "SECRET");
//Session["requestToken"] = requestToken;
string AuthoriationUrl = DropBoxStorageProviderTools.GetDropBoxAuthorizationUrl(
config, requestToken);
Process.Start(AuthoriationUrl);
accessToken = DropBoxStorageProviderTools.ExchangeDropBoxRequestTokenIntoAccessToken(
config, "xxxxxxxxxxxxx", "xxxxxxxxxxxxx", requestToken);
CloudStorage dropBoxStorage = new CloudStorage();
var storageToken = dropBoxStorage.Open(config, accessToken);
var publicFolder = dropBoxStorage.GetFolder("/");
// upload a testfile from temp directory into public folder of DropBox
String srcFile = Environment.ExpandEnvironmentVariables(@"C:\Test\MyTestFile.txt");
var rep = dropBoxStorage.UploadFile(srcFile, publicFolder);
MessageBox.Show("Uploaded Successfully..");
// The exception is thrown on this call:
dropBoxStorage.DownloadFile("/MyTestFile.txt",
Environment.ExpandEnvironmentVariables("D:\\test"));
MessageBox.Show("Downloaded Successfully..");
dropBoxStorage.Close();
This is the Error shown in Visual Studio.
SharpBox has a bug that only occurs in .NET 4.5, because the behavior of the System.Uri class changed between 4.0 and 4.5.
The method GetDownloadFileUrlInternal() in DropBoxStorageProviderService.cs generates an incorrect URL, because it encodes each slash as %2f. In .NET 4.0, the System.Uri object in the method GenerateSignedUrl() in OAuthUrlGenerator.cs converts this URL back correctly.
I changed the method GetDownloadFileUrlInternal() from this...
public static String GetDownloadFileUrlInternal(IStorageProviderSession session, ICloudFileSystemEntry entry)
{
// cast variables
DropBoxStorageProviderSession dropBoxSession = session as DropBoxStorageProviderSession;
// gather information
String rootToken = GetRootToken(dropBoxSession);
String dropboxPath = GenericHelper.GetResourcePath(entry);
// add all information to url;
String url = GetUrlString(DropBoxUploadDownloadFile, session.ServiceConfiguration) + "/" + rootToken;
if (dropboxPath.Length > 0 && dropboxPath[0] != '/')
url += "/";
url += HttpUtilityEx.UrlEncodeUTF8(dropboxPath);
return url;
}
...to this...
public static String GetDownloadFileUrlInternal(IStorageProviderSession session, ICloudFileSystemEntry entry)
{
// cast variables
DropBoxStorageProviderSession dropBoxSession = session as DropBoxStorageProviderSession;
// gather information
String rootToken = GetRootToken(dropBoxSession);
// add all information to url;
String url = GetUrlString(DropBoxUploadDownloadFile, session.ServiceConfiguration) + "/" + rootToken;
ICloudFileSystemEntry parent = entry.Parent;
String dropboxPath = HttpUtilityEx.UrlEncodeUTF8(entry.Name);
while(parent != null)
{
dropboxPath = HttpUtilityEx.UrlEncodeUTF8(parent.Name) + "/" + dropboxPath;
parent = parent.Parent;
}
if (dropboxPath.Length > 0 && dropboxPath[0] != '/')
url += "/";
url += dropboxPath;
return url;
}
and it now works with .NET 4.5. There may be a better way to fix the problem, but so far I have not noticed any misbehavior.
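The idea behind the fix, encoding each path segment separately so that the slashes between them survive, can be shown in isolation. This is a minimal sketch, not SharpBox code: it uses the framework's Uri.EscapeDataString instead of SharpBox's HttpUtilityEx.UrlEncodeUTF8, and the PathEncodingDemo class, the EncodePath helper, and the sample path are hypothetical.

```csharp
using System;

static class PathEncodingDemo
{
    // Percent-encodes every segment of a '/'-separated path on its own,
    // so the separators themselves are never turned into %2F.
    public static string EncodePath(string path)
    {
        string[] segments = path.Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            segments[i] = Uri.EscapeDataString(segments[i]);
        }
        return string.Join("/", segments);
    }

    static void Main()
    {
        string path = "/Test Folder/MyTestFile.txt";

        // Encoding the whole path at once also encodes the slashes.
        Console.WriteLine(Uri.EscapeDataString(path)); // %2FTest%20Folder%2FMyTestFile.txt

        // Encoding segment by segment keeps them.
        Console.WriteLine(EncodePath(path)); // /Test%20Folder/MyTestFile.txt
    }
}
```

The modified GetDownloadFileUrlInternal() reaches the same result by walking up entry.Parent and joining the encoded names with "/".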
