Unicode text from SQLite database seems to be broken - sqlite

I'm using http://hackage.haskell.org/package/sqlite-0.5.2.2 for binding to SQLite database. Inside *.db file there is text in UTF-8 encoding, I can assure this in text editor and sqlite CLI tool.
When connecting to database and retrieving data - the text content is broken. The simple test goes below:
import qualified Database.SQLite as SQL
import Control.Applicative ((<$>))
import System.IO
buildSkypeMessages dbh =
(go <$> (SQL.execStatement dbh "select chatname,author,timestamp,body_xml from messages order by chatname, timestamp")) >>=
writeIt
where
writeIt content = withFile "test.txt" WriteMode (\handle -> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content)
go (Left msg) = fail msg
go (Right rows) = map f $ concat rows
where
f' (("chatname",SQL.Text chatname):
("author",SQL.Text author):
("timestamp",SQL.Int timestamp):
r) = ([chatname, author], r)
f xs = let (partEntry, (item:_)) = f' xs
in case item of
("body_xml",SQL.Text v) -> v:partEntry
("body_xml",SQL.Null) -> "":partEntry
escape (_,SQL.Text v) = v
escape (_,SQL.Null) = ""
escape (_,SQL.Int v) = show v
What may be wrong there? Am I missing something with Sqlite or with Haskell I/O and encodings?

Actually the problem was not related to SQLite bindings but to String handling in Haskell. What solved the problem - invoking hSetBinaryMode on handle before putting data on it:
writeIt content = withFile "test.txt" WriteMode (\handle -> hSetBinaryMode handle True >> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content)

Related

Usecase of Variants in Purescript/Haskell

Can someone tell me what is the use case of purescript-variants or variants in general
The documentation is very well written but I can't find any real use case scenario for it. Can someone tell how we could use Variants in real world?
Variants are duals of records. While records are sort of extensible ad-hoc product types (consider data T = T Int String vs. type T = { i :: Int, s :: String }), variants can be seen as extensible ad-hoc sum types - e.g. data T = A Int | B String vs. Variant (a :: Int, b :: String)
For example, just as you can write a function that handles a partial record:
fullName :: forall r. { first :: String, last :: String | r } -> String
fullName r = r.first <> " " <> r.last
myFullName = fullName { first: "Fyodor", last: "Soikin", weight: "Too much" }
so too, you can write a function that handles a partial variant:
weight :: forall r. Variant (kilos :: Int, vague :: String | r) -> String
weight =
default "Unknown"
# on _kilos (\n -> show n <> " kg.")
# on _vague (\s -> "Kind of a " <> s)
myWeight = weight (inj _kilos 100) -- "100 kg."
alsoMyWeight = weight (inj _vague "buttload") -- "Kind of a buttload"
But these are, of course, toy examples. For a less toy example, I would imagine something that handles alternatives, but needs to be extensible. Perhaps something like a file parser:
data FileType a = Json | Xml
basicParser :: forall a. FileType a -> String -> Maybe a
basicParser t contents = case t of
Json -> parseJson contents
Xml -> parseXml contents
Say I'm ok using this parser in most case, but in some cases I'd also like to be able to parse YAML. What do I do? I can't "extend" the FileType sum type post-factum, the best I can do is aggregate it in a larger type:
data BetterFileType a = BasicType (FileType a) | Yaml
betterParser :: forall a. BetterFileType a -> String -> Maybe a
betterParser t contents = case t of
BasicType bt -> basicParser bt contents
Yaml -> parseYaml contents
And now whenever I call the "better parser", I have to wrap the file type awkwardly:
result = betterParser (BasicType Json) "[1,2,3]"
Worse: now every consumer has to know the hierarchy of BetterFileType -> FileType, they can't just say "json", they have to know to wrap it in BasicType. Awkward.
But if I used extensible variants for the file type, I could have flattened them nicely:
type FileType r = (json :: String, xml :: String | r)
basicParser :: forall a r. Variant (FileType r) -> Maybe a
basicParser = onMatch { json: parseJson, xml: parseXml } $ default Nothing
----
type BetterFileType r = (yaml :: String | FileType r)
betterParser :: forall a r. Variant (BetterFileType r) -> Maybe a
betterParser = onMatch { yaml: parseYaml } basicParser
Now I can use the naked variant names with either basicParser or betterParser, without knowing to wrap them or not:
r1 = betterParser $ inj _json "[1,2,3]"
r2 = betterParser $ inj _yaml "foo: [1,2,3]"

Clean language: append number in end of file, fwritei doesn't work

I'm trying to write function that receive [String] which are names of files, String which is the name of the files directory and *f. The function will append to each file an integer in the end.
Here is what I got so far:
import StdEnv
import StdFile
import FileManipulation
appendNumInEndOfVmFiles :: [String] String *f -> String
appendNumInEndOfVmFiles [] dirname w = "finished"
appendNumInEndOfVmFiles [x:xs] dirname w
# path = dirname +++ "\\\\" +++ x
# (ok,file,files) = fopen path FAppendText w
# file = fwritei 12 file
# (ok2,_) = fclose file w
= appendNumInEndOfVmFiles xs dirname w
Start w
// 1. Receive name of directory from the user.
# (io,w) = stdio w // open stdio
# io = fwrites "Enter name of directory:\n" io // ask for name
# (name,io) = freadline io // read in name
# name = name % (0, size name - 2) // remove \n from name
# (ok,w) = fclose io w // close stdio
| not ok = abort "Couldn't close stdio" // abort in case of failure
// 2. Get a list of all file names in that directory.
# (dir,w) = getDirectoryContents (RelativePath [PathDown name]) w
# fileList = getNamesOfFilesInDirectory (getEntriesList dir)
= appendNumInEndOfVmFiles (getVmFiles fileList) name w
Assume that getVmFiles is defined in my FileManipulation.dcl file and in the context of this problem name is "myDir" and file list is ["hello.vm","Wiki.vm"]
For some reason, even that I got "finished" message on the screen, the files aren't modified. No matter what kind of integer I give to fopen, even if its FWriteText or FWriteData its still doing nothing... also even if I'm using fwritec or fwrites with characters nothing happened.
What I'm missing here? Thanks a lot!
For some reason, even that I got "finished" message on the screen, the files aren't modified.
This is due to lazy evaluation. In appendNumInEndOfVmFiles, the result of fclose is not used, so fclose is not evaluated. Because of this, fwritei does not need to be evaluated either. You can fix this by adding a guard on ok2:
# (ok2,_) = fclose file w
| not ok2 = abort "fclose failed\n"
= appendNumInEndOfVmFiles xs dirname w
However, the typical way to do this would be to rewrite the function to return a *f instead of a String, so that this unique value is not lost. As long as the result is used, then, the fwritei is evaluated. You can potentially make the *f argument strict (i.e. add a ! in front). This would make sure that it is evaluated before entering the function, so that all lingering file closes have been performed.
There are some more issues with your code:
Here, w is used twice, which is illegal because it is of a strict type. You should use (ok2,w) in the guard to continue with the same environment.
# (ok2,_) = fclose file w
= appendNumInEndOfVmFiles xs dirname w
The appendNumInEndOfVmFiles needs to have a type context | FileSystem f to resolve overloading of fopen and fclose.
Lastly:
... even if its FWriteText or FWriteData ...
Just so you know: the difference would be that the first would write the integer in an ASCII representation whereas the second would write it binary as 4 or 8 bytes (depending on the bitwidth of your system).

How to Dict.get a case insensitive key?

I'd like to grab the Total-Records key of a response.headers HTTP response.
Problem is that in some browser it is as returned by the server Total-Records but in some other, it is in lower case.
I'd like to get the value of the Total-Records header regardless of its case.
How would you do that?
Yet another alternative is to use the find function from elm-community/dict-extra to provide a custom comparison operator:
import Dict.Extra
caseInsensitiveGet : String -> Dict String v -> Maybe v
caseInsensitiveGet key =
let
lowerKey = String.toLower key
in
Dict.Extra.find (\k _ -> String.toLower k == lowerKey)
>> Maybe.map Tuple.second
In the absence of a general-purpose case-insensitive dictionary, you can hack together a manual lookup:
caseInsensitiveGet : String -> Dict String v -> Maybe v
caseInsensitiveGet key dict =
let
lowerKey = String.toLower key
in
Dict.toList dict
|> List.filterMap (\(k, v) ->
if String.toLower k == lowerKey then
Just v
else
Nothing)
|> List.head
Note that this will perform worse the larger the dictionary is. In this form it is possible to have multiple variations of the same text with different cases, so it isn't perfect, but if you're just looking for a header, it should suffice.
I think the least costly way - in general - is simply to do the two look ups and see what you get. For example,
case (Dict.get "Total-Records" response.headers, Dict.get "total-Records" response.headers) of
(Just s, _) ->
s
(_, Just s) ->
s
_ ->
<handle error case>
Alternatively you could replace the above with if then else.
In this situation I think I might go (uses elm-community/list-extra):
getFirstCaseInsensitive : String -> Dict String v -> Maybe v
getFirstCaseInsensitive key dict =
let
key_ =
String.toLower key
in
dict
|> Dict.toList
|> List.Extra.find
((==) key_ << String.toLower << Tuple.first)
|> Maybe.map Tuple.second
Thank you all for your answers, I ended up using Dict.Extra.mapKeys:
import Dict.Extra
headers =
Dict.Extra.mapKeys String.toLower response.headers

Fast impertive pointers (static, unboxing, etc.) with Struct library

I am interested in using more efficient pointers for a project implementing an imperative language in Haskell. There is already a library for that: Struct. There is a blog post on it and brief documentation.
The problem is there is only a quite sophisticated example of linkcut trees. For someone like me who doesn't use Haskell on a daily basis, it is quite exhausting to battle little documented code, template haskell, etc.
I would need a simpler example to get started, along the lines of expressing either of those two data types:
import Data.IORef
data DLL a = DLL a (Maybe (IORef (DLL a))) (Maybe (IORef (DLL a)))
data DLLINT = DLLINT Int (Maybe (IORef DLLINT)) (Maybe (IORef DLLINT))
This should be just a few simple lines for someone who is fluent in Haskell/GHC.
How do I express one of the data types above with the Struct library?
I managed to get your DLL type working with Structs as follows:
{-# LANGUAGE TemplateHaskell, RoleAnnotations #-}
module DubLiList where
import Control.Monad.Primitive
import Data.Struct.TH
import Data.Struct
import Data.Struct.Internal
makeStruct [d|
data DLL a s = DLL
{ prev :: !(DLL a s)
, value :: a
, next :: !(DLL a s)
}
|]
new :: (PrimMonad m) => a -> m (DLL a (PrimState m))
new x = st $ newDLL Nil x Nil
insert :: (PrimMonad m) => a -> DLL a (PrimState m) -> m (DLL a (PrimState m))
insert x this = st $ do
prev' <- get prev this
new <- newDLL prev' x this
set prev this new
set next prev' new
return new
delete :: (PrimMonad m) => DLL a (PrimState m) -> m ()
delete this = st $ do
prev' <- get prev this
next' <- get next this
set next prev' next'
set prev next' prev'
toList :: (PrimMonad m) => DLL a (PrimState m) -> m [a]
toList this = st $ do
if isNil this then return [] else do
x <- getField value this
that <- get next this
(x:) <$> toList that
Here's an example of using it:
main :: IO ()
main = do
dll <- new "foo" -- [foo]
dll' <- insert "bar" dll -- [bar, foo]
insert "baz" dll -- [bar, baz, foo]
xs <- toList dll'
print xs

Collecting the output of an external command using OCaml

What is the right way to call an external command and collect its output in OCaml?
In Python, I can do something like this:
os.popen('cmd').read()
How I can get all of an external program's output in OCaml? Or, better, OCaml with Lwt?
Thanks.
You want Unix.open_process_in, which is described on page 388 of the OCaml system manual, version 3.10.
For Lwt,
val pread : ?env:string array -> command -> string Lwt.t
seems to be a good contender. Documentation here: http://ocsigen.org/docu/1.3.0/Lwt_process.html
let process_output_to_list2 = fun command ->
let chan = Unix.open_process_in command in
let res = ref ([] : string list) in
let rec process_otl_aux () =
let e = input_line chan in
res := e::!res;
process_otl_aux() in
try process_otl_aux ()
with End_of_file ->
let stat = Unix.close_process_in chan in (List.rev !res,stat)
let cmd_to_list command =
let (l,_) = process_output_to_list2 command in l
There are lots of examples on PLEAC.
You can use the third party library Rashell which uses Lwt to define some high-level primitives to read output from processes. These primitives, defined in the module Rashell_Command, are:
exec_utility to read the output of a process as a string;
exec_test to only read the exit status of a process;
exec_query to read the output of a process line by line as a string Lwt_stream.t
exec_filter to use an external program as a string Lwt_stream.t -> string Lwt_stream.t transformation.
The command function is used to create command contexts on which the previous primitives can be applied, it has the signature:
val command : ?workdir:string -> ?env:string array -> string * (string array) -> t
(** [command (program, argv)] prepare a command description with the
given [program] and argument vector [argv]. *)
So for instance
Rashell_Command.(exec_utility ~chomp:true (command("", [| "uname" |])))
is a string Lwt.t which returns the “chomped” string (new line removed) of the “uname” command. As a second example
Rashell_Command.(exec_query (command("", [| "find"; "/home/user"; "-type"; "f"; "-name"; "*.orig" |])))
is a string Lwt_stream.t whose elements are the paths of the file found by the command
find /home/user -type f -name '*.orig'
The Rashell library defines also interfaces to some commonly used commands, and a nice interface to the find command is defined in Rashell_Posix – which by the way guarantees POSIX portability.

Resources