Screen scrape to email with full url for images and css - asp.net

I am screen scraping a webpage and sending it as a html email.
What is the easiest/best way to manipulate the html to set full http addresses for all images and css files?
Our current method is similar to the following (typed from memory), and it is very error-prone:
string html = rawHtml.Replace("=\"", "=\"" + Request["SERVER_NAME"]);
Here is the current function we use to screen scrape using GET
public static string WebGet(string address)
{
string result = "";
using (WebClient client = new WebClient())
{
using (StreamReader reader = new StreamReader(client.OpenRead(address)))
{
string s = reader.ReadToEnd();
result = s;
}
}
return result;
}

It sounds like what you need is an HTML parser. Once you parse the HTML string with the parser, you can run commands that manipulate the DOM, so you could find all img elements, check their src, and prepend the Request["SERVER_NAME"] where the address is relative. The same applies to the href of link elements for CSS files.
I don't code in ASP, but I found this:
http://htmlagilitypack.codeplex.com/
And here is a useful article I found explaining how to use it:
https://web.archive.org/web/20211020001935/https://www.4guysfromrolla.com/articles/011211-1.aspx
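As a rough sketch of that approach, assuming the Html Agility Pack NuGet package is referenced, the helper below resolves every img src and link href against the address the page was scraped from. The names HtmlUrlFixer and MakeUrlsAbsolute, and the choice of which attributes to rewrite, are illustrative assumptions and not part of the question's code.

```csharp
using System;
using HtmlAgilityPack; // third-party: Html Agility Pack (NuGet package "HtmlAgilityPack")

public static class HtmlUrlFixer
{
    // Rewrites relative img/src and link/href values to absolute URLs.
    // pageAddress is the URL the HTML was downloaded from.
    public static string MakeUrlsAbsolute(string rawHtml, string pageAddress)
    {
        var baseUri = new Uri(pageAddress);
        var doc = new HtmlDocument();
        doc.LoadHtml(rawHtml);

        FixAttribute(doc, "//img[@src]", "src", baseUri);
        FixAttribute(doc, "//link[@href]", "href", baseUri);

        return doc.DocumentNode.OuterHtml;
    }

    private static void FixAttribute(HtmlDocument doc, string xpath, string attribute, Uri baseUri)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null) return;

        foreach (var node in nodes)
        {
            string value = node.GetAttributeValue(attribute, "");
            if (string.IsNullOrWhiteSpace(value)) continue;

            Uri absolute;
            // Already-absolute URLs come back unchanged; relative ones
            // are resolved against the page address.
            if (Uri.TryCreate(baseUri, value, out absolute))
            {
                node.SetAttributeValue(attribute, absolute.AbsoluteUri);
            }
        }
    }
}
```

With the WebGet function from the question, usage would look like `string html = HtmlUrlFixer.MakeUrlsAbsolute(WebGet(address), address);`. Resolving against the full page address, not just the server name, also handles paths such as `../images/logo.png`.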

Related

Xamarin.Forms - need to get resulting HTML from a page execution (CGI on server)

I'm using Xamarin.Forms and need to:
call an HTML page which is a CGI page executing on the server.
retrieve the resulting page's content.
What is the best method in C# to accomplish this?
string html = string.Empty;
using (var http = new HttpClient()) {
html = await http.GetStringAsync("http://www.url.com");
}

Inline image rendered twice by OSX mail app

My .NET 4.5 web application uses class SmtpClient to create and send e-mail messages to various recipients.
Each e-mail message consists of:
an HTML message body
an embedded inline image (JPEG, PNG, or GIF)
an attachment (PDF)
Sample code is below. It works fine, but there is one gripe from OSX users. Apple's standard mail app renders the image twice; once inlined in the message body, and again following the message body, next to the preview of the PDF attachment.
I tinkered with the following properties; none of which would help.
SmtpClient's DeliveryFormat
MailMessage's IsBodyHtml and BodyTransferEncoding
Attachment's MimeType, Inline, DispositionType, ContentId, FileName, Size, CreationDate, ModificationDate
If I compose a similar e-mail message in MS Outlook and send it off to the Apple user, the image is rendered once, inlined in the message body; exactly as I would like it to be. So apparently it is possible.
After reading this, I inspected the raw MIME data, and noticed Outlook uses multipart/related to group together the message body and the images.
My question:
How do I mimic Outlook's behavior with the classes found in System.Net.Mail?
Things I would rather not do:
Employ external images instead of embedded ones (many e-mail clients initially block these to protect recipient's privacy).
Use third party libraries (to avoid legal hassle). The SmtpDirect class I found here seems to solve the problem (though I got a server exception in return), but it is hard for me to accept a complete rewrite of MS's SmtpClient implementation is necessary for such a subtle change.
Send the e-mail message to a pickup folder, manipulate the resulting .eml file, push the file to our Exchange server.
Minimal code to reproduce the problem:
using System.IO;
using System.Net.Mail;
using System.Net.Mime;
namespace SendMail
{
class Program
{
const string body = "Body text <img src=\"cid:ampersand.gif\" /> image.";
static Attachment CreateGif()
{
var att = new Attachment(new MemoryStream(Resource1.ampersand), "ampersand.gif")
{
ContentId = "ampersand.gif",
ContentType = new ContentType(MediaTypeNames.Image.Gif)
};
att.ContentDisposition.Inline = true;
return att;
}
static Attachment CreatePdf()
{
var att = new Attachment(new MemoryStream(Resource1.Hello), "Hello.pdf")
{
ContentId = "Hello.pdf",
ContentType = new ContentType(MediaTypeNames.Application.Pdf)
};
att.ContentDisposition.Inline = false;
return att;
}
static MailMessage CreateMessage()
{
var msg = new MailMessage(Resource1.from, Resource1.to, "The subject", body)
{
IsBodyHtml = true
};
msg.Attachments.Add(CreateGif());
msg.Attachments.Add(CreatePdf());
return msg;
}
static void Main(string[] args)
{
new SmtpClient(Resource1.host).Send(CreateMessage());
}
}
}
To actually build and run it, you will need an additional resource file Resource1.resx with the two attachments (ampersand and Hello) and three strings host (the SMTP server), from and to (both of which are e-mail addresses).
(I found this solution myself before I got to posting the question, but decided to publish anyway; it may help out others. I am still open for alternative solutions!)
I managed to get the desired effect by using class AlternateView.
static MailMessage CreateMessage()
{
var msg = new MailMessage(Resource1.from, Resource1.to, "The subject", "Alternative message body in plain text.");
var view = AlternateView.CreateAlternateViewFromString(body, System.Text.Encoding.UTF8, MediaTypeNames.Text.Html);
var res = new LinkedResource(new MemoryStream(Resource1.ampersand), new ContentType(MediaTypeNames.Image.Gif))
{
ContentId = "ampersand.gif"
};
view.LinkedResources.Add(res);
msg.AlternateViews.Add(view);
msg.Attachments.Add(CreatePdf());
return msg;
}
As a side effect, the message now also contains a plain text version of the body (for paranoid web clients that reject HTML). Though it is a bit of a burden ("Alternative message body in plain text" needs improvement), it does give you more control as to how the message is rendered under different security settings.

How do I check whether a URL is responsive or not

I have image URLs in my database and I want to check whether each URL is responsive or not in the browser.
Please help me.
For example:
http://images.jactravel.co.uk/6008_1_1.jpg
or
http://images.jactravel.co.uk/6049_2_4.jpg
How can I automatically check whether such a URL is responsive?
I assume that by responsive you mean whether you can get a response when you call a specific URL or not.
To do that without actually downloading the content, you can use HttpClient.GetAsync(string, HttpCompletionOption) with an HttpCompletionOption of ResponseHeadersRead. This makes GetAsync complete as soon as the response headers arrive, giving you the status code (e.g. 200, 404 or 500) without waiting for the entire content to download, e.g.:
using (var client = new HttpClient())
{
using(var response = await client.GetAsync("http://mysite/myimage.jpg",
HttpCompletionOption.ResponseHeadersRead))
{
if (response.IsSuccessStatusCode)
{
//The URL is good
}
}
}
To actually read the content, you need to access one of the Read methods of the response's Content property. For example, you can use the CopyToAsync to copy the content to a file stream, or use ReadAsByteArrayAsync to read the content as a byte array, eg:
var buffer=await response.Content.ReadAsByteArrayAsync();
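The snippets above need to live inside an async method, and a URL that cannot be reached at all throws instead of returning a status code. A minimal sketch of a reusable check follows; the class name UrlChecker, the method name IsResponsiveAsync, the 10 second timeout, and the decision to treat every failure as "not responsive" are assumptions made for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class UrlChecker
{
    // One shared client for all checks; the timeout value is an arbitrary choice.
    private static readonly HttpClient Client =
        new HttpClient { Timeout = TimeSpan.FromSeconds(10) };

    // Returns true when the server answers with a 2xx status code.
    public static async Task<bool> IsResponsiveAsync(string url)
    {
        try
        {
            // ResponseHeadersRead: stop after the headers, do not download the body.
            using (var response = await Client.GetAsync(
                url, HttpCompletionOption.ResponseHeadersRead))
            {
                return response.IsSuccessStatusCode;
            }
        }
        catch (HttpRequestException) { return false; }      // DNS failure, connection refused, ...
        catch (TaskCanceledException) { return false; }     // timeout
        catch (InvalidOperationException) { return false; } // not an absolute URL
        catch (UriFormatException) { return false; }        // malformed URL
    }
}
```

For a list of URLs from the database, you could call `await UrlChecker.IsResponsiveAsync(url)` for each row and store the result.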

How to read HtmlDocument in ASP.NET?

I have an aspx page with a textbox for entering a URL and a button that should show some of the pictures found at that URL. I can load the URL's source code into an HtmlDocument, but I don't know how to get the pictures from that HTML source so that I can show them on my page. How can I do that? Thanks in advance.
You need to make the question more clear so that one can give you a specific answer.
HTML is a markup language, which means it contains only formatting tags; no pictures are embedded in an .html document. There are only links to images, in the form of URLs that can be fetched from some address. To get the images, you first need to get those URLs.
If your question is how you can get the actual html from a link then refer to the following question. But, since you say that you can get the html, then you need to parse it using Regex or HTML Agility Pack.
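As a minimal sketch of the Regex route, the helper below pulls the src of each img tag out of an HTML string and resolves it against the page's address. The names ImageUrlExtractor and GetImageUrls are hypothetical, and the pattern is a simplification: it only handles quoted attribute values and would also match attributes such as data-src, so a real parser like HTML Agility Pack is more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageUrlExtractor
{
    // Captures the quoted value of a src attribute inside an <img ...> tag.
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // html: the page source; pageUrl: the address it was downloaded from.
    public static List<string> GetImageUrls(string html, string pageUrl)
    {
        var baseUri = new Uri(pageUrl);
        var urls = new List<string>();

        foreach (Match m in ImgSrc.Matches(html))
        {
            Uri absolute;
            // Relative paths are resolved against the page URL;
            // absolute URLs pass through unchanged.
            if (Uri.TryCreate(baseUri, m.Groups[1].Value, out absolute))
            {
                urls.Add(absolute.AbsoluteUri);
            }
        }
        return urls;
    }
}
```

On the aspx page, the simplest way to show the pictures is probably to bind the returned URLs to img elements, which leaves the downloading to the visitor's browser.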
Code to get the image:
byte[] imageData = DownloadData(Url); //DownloadData function from here
MemoryStream stream = new MemoryStream(imageData);
Image img = Image.FromStream(stream);
// Do not close the stream here: it must stay open for as long as img is in use.
For the DownloadData method, you can use WebClient or WebRequest to get the image as a byte array:
WebRequest req = WebRequest.Create("[URL here]");
using (WebResponse response = req.GetResponse())
using (Stream stream = response.GetResponseStream())
using (MemoryStream buffer = new MemoryStream())
{
// The response length is not always known up front, so copy until the stream ends.
stream.CopyTo(buffer);
return buffer.ToArray();
}

how to make a picture file downloadable?

I have an ASP.NET MVC3 application and I want to link to an image file (png, jpeg, gif, etc.) so that when the user clicks the link, the file is downloaded instead of being displayed by the browser. Is there any way to do this?
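Make your link something like this:
@Html.ActionLink(
"Download Image", // text to show
"Download", // action name
"DownloadManager", // controller name (optional; if you omit it, also omit the final null)
new { filename = "my-image", fileext = "jpeg" }, // file name and extension
null // htmlAttributes; needed to select the overload that takes a controller name
)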
and action-method is here:
public FilePathResult Download(string filename, string fileext) {
var basePath = Server.MapPath("~/Contents/Images/");
var fullPath = System.IO.Path.Combine(
basePath, string.Concat(filename.Trim(), '.', fileext.Trim()));
var contentType = GetContentType(fileext);
// The file name to use in the file-download dialog box that is displayed in the browser.
var downloadName = "one-name-for-client-file." + fileext;
return File(fullPath, contentType, downloadName);
}
private string GetContentType(string fileext) {
switch (fileext) {
case "jpg":
case "jpe":
case "jpeg": return "image/jpeg";
case "png": return "image/png";
case "gif": return "image/gif";
default: throw new NotSupportedException();
}
}
UPDATE:
In fact, when a file is sent to the browser this way, the following key/value pair is added to the HTTP headers:
Content-Disposition: attachment; filename=file-client-name.ext
Here file-client-name.ext is the name and extension under which the file should be saved on the client system. For example, if you want to do this in ASP.NET without MVC, you can create an HttpHandler, write the file stream to the Response, and add the above key/value pair to the HTTP headers:
Response.Headers.Add("Content-Disposition", "attachment; filename=" + "file-client-name.ext");
just this, enjoy :D
Well technically your browser is downloading it.
I don't think you can directly link to an image, and have the browser prompt to download.
You could try something where instead of linking directly to the image, you link to a page, which serves up the image in a zip file perhaps - which of course would prompt the download to occur.
Yes, you can.
Now, you'll need to customize this to suit your needs, but I created a FileController that returned files by an identifier (you can easily return by name).
public class FileController : Controller
{
public ActionResult Download(string name)
{
// check the existence of the filename, and load it in to memory
byte[] data = SomeFunctionToReadTheFile(name);
FileContentResult result = new FileContentResult(data, "image/jpeg"); // or whatever it is
return result;
}
}
Now, how you read that file or where you get it from is up to you. I then created a route like this:
routes.MapRoute(null, "files/{name}", new { controller = "File", action = "Download"});
My database has a map of identifiers to files (it's actually more complex than this, but I am omitting that logic for brevity), I can write urls like:
"~/files/somefile"
And the relevant file is downloaded.
I don't think this is possible but a simple message saying right click to save image would suffice I think.
