Scala and html: download an image (*.jpg, etc) to Hard drive

I've got a Scala program that downloads and parses HTML. I extracted the links to the image files from the HTML; now I need to transfer those images to my hard drive. What is the best way to do this in Scala?
My connection code:
import java.net._
import java.io._
import _root_.java.io.Reader
import org.xml.sax.InputSource
import scala.xml._
def parse(sUrl: String) = {
  val url = new URL(sUrl)
  val connect = url.openConnection
  val sorce: InputSource = new InputSource
  val neo = new TagSoupFactoryAdapter // load sUrl
  val input = connect.getInputStream
  sorce.setByteStream(input)
  xml = neo.loadXML(sorce) // xml is assumed to be a var declared elsewhere
  input.close
}

You may want to take a look at java2s. The solution there is in plain Java, but you can translate it to Scala syntax and "just use it".

An alternative is to use the system commands in sys.process, which is much cleaner:
import sys.process._
import java.net.URL
import java.io.File
object Downloader {
  def start(location: String): Unit = {
    val url = new URL(location)
    var path = url match {
      case UrlyBurd(protocol, host, port, path) => (if (path == "") "/" else path)
    }
    path = path.substring(path.lastIndexOf("/") + 1)
    url #> new File(path) !!
  }
}
object UrlyBurd {
  def unapply(in: java.net.URL) = Some((
    in.getProtocol,
    in.getHost,
    in.getPort,
    in.getPath
  ))
}

One way to achieve that: collect the image URLs and request each one from the server (open a new connection with the image URL and store the byte stream on the hard drive).

Xamarin - CachedImage - Access the downloaded file

I am using the CachedImage component of FFImageLoading. I have a kind of gallery with a carousel view.
All the images are loaded from an internet URL; they are not local images. I would like to add an image sharing function, but I don't want to download the file again. Is there a way to access the file that the CachedImage component has already downloaded, so that I can reuse it in the share function?

Try using MD5Helper:
var path = ImageService.Instance.Config.MD5Helper.MD5("https://yourfileUrlOrKey");

Thanks, Jason.
Here is the relevant part of my code:
var key = ImageService.Instance.Config.MD5Helper.MD5("https://yourfileUrlOrKey");
var imagePath = await ImageService.Instance.Config.DiskCache.GetFilePathAsync(key);
var tempFile = Path.Combine(Path.GetTempPath(), "test.jpg");
if (File.Exists(tempFile))
{
    File.Delete(tempFile);
}
File.Copy(imagePath, tempFile);
await Share.RequestAsync(new ShareFileRequest
{
    Title = "Test",
    File = new ShareFile(tempFile)
});
I believe the temporary file is needed because the cached file has no extension, so other applications do not recognize its type.

reading Word file from POST request in grails

I'm trying to write a Groovy script that will post a Word (docx) file to a REST handler on my grails application.
The request is constructed like so:
import org.apache.http.HttpEntity
import org.apache.http.HttpResponse
import org.apache.http.client.methods.HttpPost
import org.apache.http.entity.mime.MultipartEntity
import org.apache.http.entity.mime.content.FileBody
import org.apache.http.entity.mime.content.StringBody
import org.apache.http.impl.client.DefaultHttpClient
class RestFileUploader {
    def sendFile(file, filename) {
        def url = 'http://url.of.my.app';
        DefaultHttpClient httpclient = new DefaultHttpClient();
        HttpPost httppost = new HttpPost(url);
        MultipartEntity reqEntity = new MultipartEntity();
        FileBody bin = new FileBody(file);
        reqEntity.addPart("file", new FileBody((File)file, "application/msword"));
        def normalizedFilename = filename.replace(" ", "")
        reqEntity.addPart("fileName", new StringBody(normalizedFilename));
        httppost.setEntity(reqEntity);
        httppost.setHeader('X-File-Size', (String)file.size())
        httppost.setHeader('X-File-Name', filename)
        httppost.setHeader('Content-Type', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document; charset=utf-8')
        println "about to post..."
        HttpResponse restResponse = httpclient.execute(httppost);
        HttpEntity resEntity = restResponse.getEntity();
        def responseXml = resEntity.content.text;
        println "posted..."
        println restResponse
        println resEntity
        println responseXml.toString()
        return responseXml.toString()
    }
}
On the receiving controller, I read in the needed headers from the request, and then try to access the file like so:
def inStream = request.getInputStream()
I end up writing out a corrupted Word file, and from examining the file size and the contents, it looks like my controller is writing out the entire request, rather than just the file.
I've also tried this approach:
def filePart = request.getPart('file')
def inStream = filePart.getInputStream()
In this case I end up with an empty input stream and nothing gets written out.
I feel like I'm missing something simple here. What am I doing wrong?

You will need to make two changes:
1) Remove the line httppost.setHeader('Content-Type'.... File upload HTTP POST requests must have the content type multipart/form-data, which HttpClient sets automatically when you construct a multipart HttpPost.
2) Change the line reqEntity.addPart("file", ... to reqEntity.addPart("file", new FileBody(file)), or use one of the other non-deprecated FileBody constructors to specify a valid content type and charset. This assumes that your file method parameter is of type java.io.File, which isn't clear to me from your snippet.
Then, as dmahapatro suggests, you should be able to read the file with request.getFile('file').
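To illustrate why reading the raw request stream produces a "corrupted" file, here is a minimal sketch of what a multipart/form-data body looks like. It is written in Python only for brevity; it is not Grails or HttpClient code. The boundary string, the stand-in file bytes, and the helpers build_multipart and extract_part are all hypothetical, and the parser is deliberately naive (it assumes the boundary never occurs inside the file).

```python
# Illustrative only: a hand-built multipart/form-data body.
BOUNDARY = b"demo-boundary"                 # hypothetical; real clients generate a random one
FILE_BYTES = b"PK\x03\x04 fake docx bytes"  # stand-in for the real .docx content


def build_multipart(file_bytes, filename):
    """Assemble a body with one text field ("fileName") and one file field ("file")."""
    name = filename.encode("utf-8")
    return b"\r\n".join([
        b"--" + BOUNDARY,
        b'Content-Disposition: form-data; name="fileName"',
        b"",
        name,
        b"--" + BOUNDARY,
        b'Content-Disposition: form-data; name="file"; filename="' + name + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        b"--" + BOUNDARY + b"--",
        b"",
    ])


def extract_part(body, field_name):
    """Return the raw content of the part whose form field is `field_name`, or None."""
    marker = ('name="%s"' % field_name).encode("utf-8")
    for chunk in body.split(b"--" + BOUNDARY):
        headers, sep, content = chunk.partition(b"\r\n\r\n")
        if sep and marker in headers:
            # Drop the CRLF that precedes the next boundary line.
            return content[:-2]
    return None


body = build_multipart(FILE_BYTES, "report.docx")
# The raw body carries boundaries and part headers in addition to the file.
print(len(body), len(FILE_BYTES))
# Only the "file" part holds the document bytes.
print(extract_part(body, "file") == FILE_BYTES)
```

Writing out the whole body, as request.getInputStream() does, saves the boundaries and part headers along with the document. A multipart-aware accessor such as request.getFile('file') returns only the file part.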

Need a cq5 example

I am new to Adobe CQ5. I went through many online blogs and tutorials but could not get much out of them. Can anyone provide an Adobe CQ5 application example, with a detailed explanation, that stores and retrieves data in the JCR?
Thanks in advance.

Here's a snippet for CQ 5.4 to get you started. It inserts a content page and text (as a parsys) at an arbitrary position in the content hierarchy. The position is supplied by a workflow payload, but you could write something that runs from the command line and use any valid CRX path instead. The advantage of making it a process step is that you get a session established for you, and the navigation to the insert point has been taken care of.
import java.text.SimpleDateFormat;
import java.util.Date;
import javax.jcr.Node;
import javax.jcr.RepositoryException;
import org.apache.sling.jcr.resource.JcrResourceConstants;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Properties;
import org.apache.felix.scr.annotations.Property;
import org.apache.felix.scr.annotations.Service;
import org.osgi.framework.Constants;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.cq.workflow.WorkflowException;
import com.day.cq.workflow.WorkflowSession;
import com.day.cq.workflow.exec.WorkItem;
import com.day.cq.workflow.exec.WorkflowData;
import com.day.cq.workflow.exec.WorkflowProcess;
import com.day.cq.workflow.metadata.MetaDataMap;
import com.day.cq.wcm.api.NameConstants;
@Component
@Service
@Properties({
    @Property(name = Constants.SERVICE_DESCRIPTION,
        value = "Makes a new tree of nodes, subordinate to the payload node, from the content of a file."),
    @Property(name = Constants.SERVICE_VENDOR, value = "Acme Coders, LLC"),
    @Property(name = "process.label", value = "Make new nodes from file")})
public class PageNodesFromFile implements WorkflowProcess {
    private static final Logger log = LoggerFactory.getLogger(PageNodesFromFile.class);
    private static final String TYPE_JCR_PATH = "JCR_PATH";
    // ...
    public void execute(WorkItem workItem, WorkflowSession workflowSession, MetaDataMap args)
            throws WorkflowException {
        //get the payload
        WorkflowData workflowData = workItem.getWorkflowData();
        if (!workflowData.getPayloadType().equals(TYPE_JCR_PATH)) {
            log.warn("unusable workflow payload type: " + workflowData.getPayloadType());
            workflowSession.terminateWorkflow(workItem.getWorkflow());
            return;
        }
        String payloadString = workflowData.getPayload().toString();
        //the text to be inserted
        String lipsum = "Lorem ipsum...";
        //set up some node info
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat("d-MMM-yyyy-HH-mm-ss");
        String newRootNodeName = "demo-page-" + simpleDateFormat.format(new Date());
        SimpleDateFormat simpleDateFormatSpaces = new SimpleDateFormat("d MMM yyyy HH:mm:ss");
        String newRootNodeTitle = "Demo page: " + simpleDateFormatSpaces.format(new Date());
        //insert the nodes
        try {
            Node parentNode = (Node) workflowSession.getSession().getItem(payloadString);
            Node pageNode = parentNode.addNode(newRootNodeName);
            pageNode.setPrimaryType(NameConstants.NT_PAGE); //cq:Page
            Node contentNode = pageNode.addNode(Node.JCR_CONTENT); //jcr:content
            contentNode.setPrimaryType("cq:PageContent"); //or use MigrationConstants.TYPE_CQ_PAGE_CONTENT
                                                          //from com.day.cq.compat.migration
            contentNode.setProperty(javax.jcr.Property.JCR_TITLE, newRootNodeTitle); //jcr:title
            contentNode.setProperty(NameConstants.PN_TEMPLATE,
                    "/apps/geometrixx/templates/contentpage"); //cq:template
            contentNode.setProperty(JcrResourceConstants.SLING_RESOURCE_TYPE_PROPERTY,
                    "geometrixx/components/contentpage"); //sling:resourceType
            Node parsysNode = contentNode.addNode("par");
            parsysNode.setProperty(JcrResourceConstants.SLING_RESOURCE_TYPE_PROPERTY,
                    "foundation/components/parsys");
            Node textNode = parsysNode.addNode("text");
            textNode.setProperty(JcrResourceConstants.SLING_RESOURCE_TYPE_PROPERTY,
                    "foundation/components/text");
            textNode.setProperty("text", lipsum);
            textNode.setProperty("textIsRich", true);
            workflowSession.getSession().save();
        }
        catch (RepositoryException e) {
            log.error(e.toString(), e);
            workflowSession.terminateWorkflow(workItem.getWorkflow());
            return;
        }
    }
}
I have posted further details and discussion.
A few other points:
I incorporated a timestamp into the name and title of the content page to be inserted. That way, you can run many code and test cycles without cleaning up your repository, and you know which test was the most recently run. Added bonus: no duplicate file names, no ambiguity.
Adobe and Day have been inconsistent about providing constants for property values, node types, and suchlike. I used the constants that I could find, and used literal strings elsewhere.
I did not fill in properties like the last-modified date. In production code I would do so.
I found myself confused by Node.setPrimaryType() and Node.getPrimaryNodeType(). The two methods are only rough complements; the setter takes a string but the getter returns a NodeType with various info inside it.
In my original version of this code, I read the text to be inserted from a file, rather than just using the static string "Lorem ipsum...".
Once you've worked through this example, you should be able to use the Adobe docs to write code that reads data back from the CRX.

If you want to learn how to write a CQ application that can store and query data in the CQ JCR, see this article:
http://scottsdigitalcommunity.blogspot.ca/2013/02/querying-adobe-experience-manager-data.html
It provides a step-by-step guide that walks you through the entire process, including building the OSGi bundle with Maven.
From the comments above, I see a reference to a BND file. You should stay away from CRXDE for creating OSGi bundles and use Maven instead.

Tridion 2009 SP1: Is it possible to publish a .htaccess file?

I am using ISAPI rewrite on a project and would like to know if it is possible to publish a .htaccess file from Tridion?
I have tried creating a Page Template with the .htaccess extension but can't create a page with no name.
Any ideas?
Could I use a C# TBB to change the page name?

I would also choose to use a binary to achieve this, but if you want to manage the htaccess file using text, rather than as a multimedia component, you can push a binary into your package using the following technique:
1) Push the text of the .htaccess file into the package under an accessible name (e.g. Binary_Text).
2) Use code similar to the following to create a text file from the text in that variable and add it to the package:
class publishStringItemAsBinary : ITemplate
{
    public void Transform(Engine engine, Package package)
    {
        TemplatingLogger log = TemplatingLogger.GetLogger(typeof(publishStringItemAsBinary));
        TemplateUtilities utils = new TemplateUtilities();
        System.IO.Stream inputStream = null;
        try
        {
            string strInputName = package.GetValue("InputItem");
            string strFileName = package.GetValue("strFileName");
            string sg_Destination = package.GetValue("sg_Destination");
            string itemComponent = package.GetValue("mm_Component");
            inputStream = new MemoryStream(Encoding.UTF8.GetBytes(package.GetValue(strInputName)));
            log.Debug("InputObject:" + strInputName);
            log.Debug("Filename for binary:" + strFileName);
            log.Debug("Destination StructureGroup:" + sg_Destination);
            Publication contextPub = utils.getPublicationFromContext(package, engine);
            TcmUri uriLocalSG = TemplateUtilities.getLocalUri(new TcmUri(contextPub.Id), new TcmUri(sg_Destination));
            TcmUri uriLocalMMComp = TemplateUtilities.getLocalUri(new TcmUri(contextPub.Id), new TcmUri(itemComponent));
            StructureGroup sg = (StructureGroup)engine.GetObject(uriLocalSG);
            Component comp = (Component)engine.GetObject(uriLocalMMComp);
            String sBinaryPath = engine.PublishingContext.RenderedItem.AddBinary(inputStream, strFileName, sg, "nav", comp, "text/xml").Url;
            //Put a copy of the path in the package in case you need it
            package.PushItem("BinaryPath", package.CreateStringItem(ContentType.Html, sBinaryPath));
        }
        catch (Exception e)
        {
            log.Error(e.Message);
        }
        finally
        {
            if (inputStream != null)
            {
                inputStream.Close();
            }
        }
    }
}
I think the code is pretty self-explanatory. It publishes a binary of type text/xml, but there should be no issue adapting it to publish a plain text file.

I think you can use a multimedia component to store your .htaccess. Even though you will not be able to upload a file without a name (a Windows limitation), you can change the filename later by modifying the BinaryContent.Filename property of the multimedia component. You can then publish this component separately, or use the AddBinary method in one of your templates.
There is also a user schema where you can change some other rules, "\Tridion\bin\cm_xml_usr.xsd", but it will not let you allow empty filenames.

How to upload files in flex using PyAMF or PhpAMF? client side, and very little server side help needed

Hi!
I need to upload a group of images using Flex with Robotlegs.
I need a progress bar that works while an image is uploading.
It might upload one image or several at a time.
I want to know whether uploading a ByteArray to the server and then saving the image is too heavy for the server.
On the server side I have a method that is exposed through PyAMF and looks like this:
def upload_image(input):
    # here it does stuff. I need to be able to get parameters like this
    input.list_key
    # and here I need some help on how to save the file
Thanks ;)

I had to tackle a similar problem (uploading a single photo from Flex to Django) while working on captionmash.com, so maybe this can help you. I was using PyAMF for normal messaging, but the FileReference class has a built-in upload method, so I chose the easy way.
Basically, the system lets you upload a single file from Flex to Google App Engine, then uses App Engine's Image API to create a thumbnail and convert the image to JPEG, and finally uploads both to an S3 bucket. The boto library is used for the Amazon S3 connection. You can view the whole code of the project on GitHub.
This code handles a single file upload only, but you should be able to do multi-file uploads by creating an array of FileReference objects and calling the upload method on each of them.
The code I'm posting here is a bit cleaned up; if you still have problems, you should check out the repo.
Client side (Flex):
private function upload(fileReference:FileReference,
                        album_id:int,
                        user_id:int):void{
    try {
        // 500 KB image size limit
        if(fileReference.size > ApplicationConstants.IMAGE_SIZE_LIMIT){
            trace("File too big"+fileReference.size);
            return;
        }
        fileReference.addEventListener(Event.COMPLETE,onComplete);
        var data:URLVariables = new URLVariables();
        var request:URLRequest = new URLRequest(ApplicationConstants.DJANGO_UPLOAD_URL);
        request.method = URLRequestMethod.POST;
        request.data = data;
        fileReference.upload(request,"file");
        //Popup indefinite progress bar
    } catch (err:Error) {
        trace("ERROR: zero-byte file");
    }
}
//When upload complete
private function onComplete(evt:Event):void{
    // The upload() parameter is not in scope here, so take the dispatcher from the event
    var fileReference:FileReference = FileReference(evt.target);
    fileReference.removeEventListener(Event.COMPLETE,onComplete);
    //Do other stuff (remove progress bar etc)
}
Server side (Django on App Engine):
Urls:
urlpatterns = patterns('',
    ...
    (r'^upload/$', receive_file),
    ...
Views:
def receive_file(request):
    uploadService = UploadService()
    file = request.FILES['file']
    uploadService.receive_single_file(file)
    return HttpResponse()
UploadService class:
import uuid
from google.appengine.api import images
from boto.s3.connection import S3Connection
from boto.s3.key import Key
import mimetypes
import settings
# THUMBNAIL_WIDTH, THUMBNAIL_HEIGHT and BUCKET_NAME are assumed to be defined elsewhere in the project
class UploadService(object):
    def receive_single_file(self, file):
        uuid_name = str(uuid.uuid4())
        content = file.read()
        image_jpeg = self.create_jpeg(content)
        self.store_in_s3(uuid_name, image_jpeg)
        thumbnail = self.create_thumbnail(content)
        self.store_in_s3('tn_' + uuid_name, thumbnail)

    # Convert image to JPEG (also reduce size)
    def create_jpeg(self, content):
        img = images.Image(content)
        img_jpeg = images.resize(content, img.width, img.height, images.JPEG)
        return img_jpeg

    # Create thumbnail image using file
    def create_thumbnail(self, content):
        image = images.resize(content, THUMBNAIL_WIDTH, THUMBNAIL_HEIGHT, images.JPEG)
        return image

    def store_in_s3(self, filename, content):
        conn = S3Connection(settings.ACCESS_KEY, settings.PASS_KEY)
        b = conn.get_bucket(BUCKET_NAME)
        # The UUID names have no extension, so guess_type returns None;
        # fall back to JPEG, which is what both stored images are
        mime = mimetypes.guess_type(filename)[0] or 'image/jpeg'
        k = Key(b)
        k.key = filename
        k.set_metadata("Content-Type", mime)
        k.set_contents_from_string(content)
        k.set_acl("public-read")
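If you want to save the upload to the server's own disk instead of S3, a minimal sketch could look like the following. It assumes a server with a writable file system (which App Engine's standard environment does not provide) and a file-like object with a read(size) method, such as the items in request.FILES. The function name save_upload, the chunk size, and the destination directory are hypothetical.

```python
import os
import uuid


def save_upload(fileobj, dest_dir, chunk_size=64 * 1024):
    """Stream `fileobj` into a uniquely named file under `dest_dir` and return its path."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    path = os.path.join(dest_dir, uuid.uuid4().hex)
    with open(path, "wb") as out:
        while True:
            # Read a bounded chunk so a large image is never held in memory at once.
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return path
```

In a view this could be called as save_upload(request.FILES['file'], '/tmp/uploads'), where the directory is only an example. Copying in fixed-size chunks keeps memory use flat regardless of the image size, which addresses the worry about uploads being too heavy for the server.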