How to test whether an uploaded Excel file meets requirements - r

Background
Suppose I have a shiny app where the user can upload an Excel file. The users will have access to a certain Excel template and I want to make sure that only copies of this template are uploaded.
My current approach
My current approach is now as follows:
Check if sheet name xyz is present -> if not throw an error
Read data from sheet xyz, compare column names with requirements -> if missing columns throw an error
Repeat for all necessary sheets
Problem with the current approach
This requires hard-coding every required sheet name and column name, which quickly becomes tedious.
Question
So my question: how can I ensure that the user provides a valid file? What strategies do you usually use to make sure that the uploaded file can be properly processed by your apps?
Pseudo Code
library(shiny)
library(tidyverse)
library(readxl)  # read_xlsx() lives in readxl, which library(tidyverse) does not attach
ui <- fluidPage(fileInput("file", "Upload Excel"))
server <- function(input, output, session) {
  observe({
    req(input$file)
    sheet1 <- tryCatch(
      read_xlsx(input$file$datapath, sheet = "xyz"),
      error = function(e) {
        ## do some sort of error handling, e.g. write to a reactiveValue list
        NULL
      }
    )
    if (!all(.REQUIRED_FIELDS_FOR_XYZ %in% names(sheet1))) {
      ## signal error
    }
  })
}

If you are already using Excel, why not use a macro to do the work for you? Consider listing file paths and checking format types, cell addresses, cell values, etc. The macro below will do most of the heavy lifting for you.
Sub GetFolder_Data_Collection()
Dim colFiles As Collection, c As Range
Dim strPath As String, f, sht As Worksheet
Dim wbSrc As Workbook, wsSrc As Worksheet
Dim rw As Range
Dim sh As Worksheet, i As Long
Set sht = ActiveSheet
strPath = ThisWorkbook.Path
Set colFiles = GetFileMatches(strPath, "*.xlsx", True)
With sht
.Range("A:I").ClearContents
.Range("A1").Resize(1, 5).Value = Array("Name", "Path", "Cell", "Value", "Numberformat")
Set rw = .Rows(2)
End With
For Each f In colFiles
Set wbSrc = Workbooks.Open(f)
Set wsSrc = wbSrc.Sheets(1)
For Each c In wsSrc.Range(wsSrc.Range("A1"), _
wsSrc.Cells(1, Columns.Count).End(xlToLeft)).Cells
rw.Cells(2).Value = wbSrc.Path
sht.Hyperlinks.Add Anchor:=rw.Cells(1), Address:=wbSrc.Path, TextToDisplay:=wbSrc.Name
rw.Cells(3).Value = c.Address(False, False)
rw.Cells(4).Value = c.Value
rw.Cells(5).Value = c.NumberFormat
i = 6
For Each sh In wbSrc.Worksheets
If sh.Name Like "Sheet1*" Or sh.Name Like "*Sheet2*" Then rw.Cells(i).Value = sh.Name & " Exists"
i = i + 1
Next
Set rw = rw.Offset(1, 0)
Next c
wbSrc.Close False
Next f
End Sub
'Return a collection of file objects given a starting folder and a file pattern
' e.g. "*.txt"
'Pass False for last parameter if don't want to check subfolders
Function GetFileMatches(startFolder As String, filePattern As String, _
Optional subFolders As Boolean = True) As Collection
Dim fso, fldr, f, subFldr
Dim colFiles As New Collection
Dim colSub As New Collection
Set fso = CreateObject("scripting.filesystemobject")
colSub.Add startFolder
Do While colSub.Count > 0
Set fldr = fso.GetFolder(colSub(1))
colSub.Remove 1
For Each f In fldr.Files
If UCase(f.Name) Like UCase(filePattern) Then colFiles.Add f
Next f
If subFolders Then
For Each subFldr In fldr.subFolders
colSub.Add subFldr.Path
Next subFldr
End If
Loop
Set GetFileMatches = colFiles
End Function
Put this code in an .xlsb or .xlsm Excel file in the same folder as your Excel files.
It's probably just easier to do this kind of thing with Excel, and I'm a huge proponent of using the right tool for the job.
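The hard-coding problem raised in the question can also be reduced by describing the template once as data and checking every upload against that description. The following is a minimal sketch of that idea in Python rather than R; `TEMPLATE_SCHEMA`, `validate_workbook`, and the sheet and column names are all hypothetical, and the sketch assumes the sheet names and header rows have already been read from the uploaded file by whatever reader the app uses.

```python
from typing import Dict, Iterable, List

# Hypothetical template description: sheet name -> required column names.
TEMPLATE_SCHEMA: Dict[str, List[str]] = {
    "xyz": ["id", "name", "value"],
    "abc": ["id", "date"],
}


def validate_workbook(
    columns_by_sheet: Dict[str, Iterable[str]],
    schema: Dict[str, List[str]] = TEMPLATE_SCHEMA,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems: List[str] = []
    for sheet, required in schema.items():
        if sheet not in columns_by_sheet:
            problems.append(f"missing sheet: {sheet}")
            continue
        present = set(columns_by_sheet[sheet])
        missing = [col for col in required if col not in present]
        if missing:
            problems.append(
                f"sheet {sheet} is missing columns: {', '.join(missing)}"
            )
    return problems
```

Collecting all problems in one list, instead of stopping at the first one, lets the app show the user everything that is wrong with the file in a single message. Adding a sheet to the template then means adding one entry to the schema rather than another block of checks.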

Related

How to copy workbook (.aspx file) from html link to current workbook

I have trouble with the following task in Excel VBA:
At my work, we use a document management platform called TeamShare: [https://www.lector.dk/en/products/]
I want to write VBA code that loops over a range of links to this document management platform in my workbook, i.e. loops over other workbooks, opens them, and then copies a specified sheet into my current workbook.
I have tried putting together bits of code from other sites, and the code works just fine when I run it in break mode. However, when I run the code all at once, Excel reopens, so the current workbook cannot "communicate" with the opened workbook and I end up in an infinite loop (so no direct error message).
This is the code that only works in break mode:
Dim wbCopyTo As Workbook
Dim wsCopyTo As Worksheet
Dim i As Long
Dim Count As Long
Dim WBCount As Long
Dim LastRow As Long
Dim wb As Workbook
Dim ws As Worksheet
Dim URL As String
Dim IE As Object
Dim doc As Object
Dim objElement As Object
Dim objCollection As Object
Set wbCopyTo = ActiveWorkbook
Set wsCopyTo = ActiveSheet
LastRow = wsCopyTo.Range("B" & Rows.Count).End(xlUp).Row
For i = 2 To LastRow
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
'The purpose of this piece of code is to get the DocID
A = InStr(wsCopyTo.Range("B" & i), "documentid=") + Len("documentid=")
B = InStrRev(wsCopyTo.Range("B" & i), "&")
DocID = Mid(wsCopyTo.Range("B" & i), A, B - A)
'Get URL
URL = wsCopyTo.Range("B" & i)
'Count number of open workbooks
WBCount = Workbooks.Count
With IE
'The next command opens the Excel workbook. This works as planned in break mode; however, Excel reopens when I run the code all at once. I have tried other commands here: with "Workbooks.Open" I couldn't get the file to open, and "Application.FollowHyperlink" also worked only in break mode, and much more slowly.
.Navigate URL
'This was my solution to how to stop the rest of the code from executing until the new workbook has loaded.
Do Until Workbooks.Count = WBCount + 1: Loop
End With
'Unload IE
Set IE = Nothing
Set objElement = Nothing
Set objCollection = Nothing
'To activate the workbook from the URL, I loop over all my open workbooks and match them on their unique document ID. I found that the workbook from the URL wasn't the active workbook by default.
For Each book In Workbooks
If Mid(book.Name, 12, Len(DocID)) = DocID Then
book.Activate
Set wb = ActiveWorkbook
Set ws = ActiveSheet
End If
Next book
'Here I copy the desired sheet to my initial workbook
wb.Worksheets("SpecificSheetIWantToCopy").Copy After:=wbCopyTo.Worksheets("Sheet1")
wbCopyTo.Sheets(ActiveSheet.Name).Name = DocID
Next i
End Sub
I am using Excel 2010.
I hope you can help me resolve this problem. Please ask if you need any more information that I haven't provided.
Thanks in advance.

Rename file name based on upload date

I wrote the code below to copy a file and rename it. The problem is that I need to pick only the latest file (based on upload date) and then rename it, but the code below processes every matching file in the folder regardless of upload date. Also, is there simple code to check on upload whether the file already exists and then show a message (successful upload, or failed upload because of a duplicate file)?
Dim directory = Server.MapPath("App_Data/text/")
For Each filename As String In IO.Directory.GetFiles(directory, "*", IO.SearchOption.AllDirectories)
    Dim fName As String = IO.Path.GetFileName(filename)
    If fName Like "*Cust*" Then
        System.IO.File.Delete(Server.MapPath("App_Data\test\Customer.txt"))
        My.Computer.FileSystem.CopyFile(Server.MapPath("App_Data\text\" & fName), Server.MapPath("App_Data\test\" & fName))
        My.Computer.FileSystem.RenameFile(Server.MapPath("App_Data\test\" & fName), "Customer.txt")
    End If
Next
You can use the code below to find the creation date and last-modified date of a file:
Dim creation As DateTime = IO.File.GetCreationTime("C:\test.txt")
Dim modification As DateTime = IO.File.GetLastWriteTime("C:\test.txt")
Or import System.IO and use this code:
Dim fi As FileInfo = New FileInfo("path")
Dim created = fi.CreationTime
Dim lastmodified = fi.LastWriteTime
I think the second one is better because you can easily put the FileInfo objects in a collection and then sort or compare them.
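To make the "sort by timestamp, then take the latest" step concrete, here is a small sketch in Python rather than VB.NET. It assumes that the last-modified time is an acceptable stand-in for the upload date; the function names `newest_matching` and `publish_latest` are hypothetical, and the default pattern `*Cust*` is borrowed from the question.

```python
import shutil
from pathlib import Path
from typing import Optional


def newest_matching(directory: Path, pattern: str) -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in directory.rglob(pattern) if p.is_file()]
    # max() keyed on st_mtime picks the file with the latest write time.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def publish_latest(source: Path, target: Path, pattern: str = "*Cust*") -> bool:
    """Copy the newest match to target; return False if nothing matched."""
    latest = newest_matching(source, pattern)
    if latest is None:
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(latest, target)  # overwrites an existing target file
    return True
```

Only one file is copied per call, so older matching files in the source folder are left untouched instead of each overwriting the target in turn.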

Rename multiple files in one directory

Is it possible to rename all the files in a folder with a simple program using VB.NET?
Let's say there is a folder containing the following files (the files vary each time they are loaded):
A123.txt
B456.txt
C567.txt
Would it be possible to rename these files in one operation, as below:
A_1.txt
B_2.txt
B_3.txt
This could be simplified but here is the long version with comments every step of the way
Imports System.IO
' get your filenames to be renamed
Dim filenames = Directory.GetFiles("c:\path\with\files")
' order them by whatever rule you choose (here it is by modification date)
Dim orderedFilenames = filenames.OrderBy(Function(s) File.GetLastWriteTime(s))
' rename the files into a dictionary<oldname, newname>
' will result in filename X_Y.Z where X is first letter, Y is order index, Z is extension
Dim newNameSelector =
Function(s As String, i As Integer)
Return New KeyValuePair(Of String, String)(s,
Path.Combine(Path.GetDirectoryName(s), $"{Path.GetFileName(s)(0)}_{i}{Path.GetExtension(s)}"))
End Function
' get new names using the newNameSelector function
Dim newNames = orderedFilenames.Select(newNameSelector)
' define the operation to rename the file using a keyvaluepair<oldname, newname>
Dim renameOperation = Sub(kvp As KeyValuePair(Of String, String)) File.Move(kvp.Key, kvp.Value)
' either sequential rename ...
For Each newName In newNames
    renameOperation(newName)
Next
' ... or it could be multi threaded (use one or the other, not both)
Parallel.ForEach(newNames, renameOperation)
Notes:
File.Exists check is unnecessary based on the rename rule using the file index. If you change the rule it may be necessary. The rule is ambiguous in your question body. This assumes it's only run once.
Dictionary keys will be unique since Windows won't have duplicate filenames
To answer your question, Will it be possible to rename these files in one operation[?], no, but we can do it in one "line", with multiple threads, even. Just modify the path string "c:\path\with\files" and run this code
Parallel.ForEach(Directory.GetFiles("c:\path\with\files").OrderBy(Function(s) File.GetLastWriteTime(s)).Select(Function(s, i) New KeyValuePair(Of String, String)(s, Path.Combine(Path.GetDirectoryName(s), $"{Path.GetFileName(s)(0)}_{i}{Path.GetExtension(s)}"))), Sub(kvp) File.Move(kvp.Key, kvp.Value))
I think you may need to work out your renaming logic. That is why I had separated it in the long code sample. Just work on the logic inside newNameSelector.
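The separation described above, where the naming rule is computed first and the rename is applied afterwards, can be sketched as follows. This is an illustration in Python, not a translation of the VB.NET answer: `plan_renames` and `apply_renames` are hypothetical names, and the index starts at 1 on the assumption that this matches the example in the question.

```python
from pathlib import Path
from typing import Dict, Iterable


def plan_renames(paths: Iterable[Path], start: int = 1) -> Dict[Path, Path]:
    """Map each path to <first letter>_<index><extension> in the same folder.

    Files are ordered by last-modified time before indices are assigned.
    """
    ordered = sorted(paths, key=lambda p: p.stat().st_mtime)
    return {
        p: p.with_name(f"{p.name[0]}_{i}{p.suffix}")
        for i, p in enumerate(ordered, start=start)
    }


def apply_renames(plan: Dict[Path, Path]) -> None:
    """Rename every file in the plan, refusing to overwrite an existing file."""
    for old, new in plan.items():
        if new.exists():
            raise FileExistsError(new)
        old.rename(new)
```

Because the plan is plain data, it can be printed or checked for collisions before any file is touched, and changing the naming rule only affects `plan_renames`.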
If you want to order by name, simply
Dim orderedFilenames = filenames.OrderBy(Function(s) s)
This will remove "_20190607" from your filename
Dim newNameSelector =
Function(s As String)
Return New KeyValuePair(Of String, String)(s,
Path.Combine(Path.GetDirectoryName(s), String.Format("{0}{1}", Path.GetFileName(s).Split("_"c)(0), Path.GetExtension(s))))
End Function
I know this answer is getting humongous, but the OP keeps providing new info, which means the question is morphing, and thus so is the answer. According to his most recent comment, only the earliest of each prefix should be renamed.
' group the filenames into groups using key: "prefix_"
Dim groupedFilenames = Directory.GetFiles("c:\path\with\files").GroupBy(Function(s) Path.GetFileName(s).Split("_"c)(0))
Dim filenames As New List(Of String)()
' add the min from each group, using the name as the comparer
For Each g In groupedFilenames
filenames.Add(g.Min(Function(s) s))
Next
' map each old name to its new name as a keyvaluepair<oldname, newname>
' will result in filename X.Z where X is the prefix before "_" and Z is the extension
Dim newNameSelector =
Function(s As String)
Return New KeyValuePair(Of String, String)(s,
Path.Combine(Path.GetDirectoryName(s), String.Format("{0}{1}", Path.GetFileName(s).Split("_"c)(0), Path.GetExtension(s))))
End Function
' get new names using the newNameSelector function
Dim newNames = filenames.Select(newNameSelector)
' define the operation to rename the file using a keyvaluepair<oldname, newname>
Dim renameOperation = Sub(kvp As KeyValuePair(Of String, String)) File.Move(kvp.Key, kvp.Value)
' rename the files, here multi threaded
Parallel.ForEach(newNames, renameOperation)
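The grouping step, where files are grouped by prefix and only the smallest name in each group is kept, can be illustrated with a short sketch. It is written in Python as an assumption-laden illustration: `earliest_per_prefix` and `stripped_name` are hypothetical names, and "earliest" is taken to mean the alphabetically smallest file name, which works for date suffixes such as `_20190607`.

```python
from pathlib import Path
from typing import Dict, Iterable


def earliest_per_prefix(paths: Iterable[Path], sep: str = "_") -> Dict[str, Path]:
    """Group files by the text before the first separator in the stem
    and keep the alphabetically smallest file name of each group."""
    chosen: Dict[str, Path] = {}
    for p in paths:
        prefix = p.stem.split(sep)[0]
        if prefix not in chosen or p.name < chosen[prefix].name:
            chosen[prefix] = p
    return chosen


def stripped_name(p: Path, sep: str = "_") -> Path:
    """Drop everything from the first separator on, keeping the extension."""
    return p.with_name(p.stem.split(sep)[0] + p.suffix)
```

Both functions work on path objects only and never touch the disk, so the selection logic can be tried out on a list of names before any file is actually moved.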

Exporting data from PowerPivot

I have an enormous PowerPivot table (839,726 rows), and it is simply too big to copy-paste into a regular spreadsheet. I have tried copying it and then reading it directly into R using the line data = read.table("clipboard", header = T), but neither of these approaches works. Is there some add-on or method I can use to export my PowerPivot table as a CSV or .xlsx? Thanks very much
Select the whole PowerPivot table.
Copy the data.
Paste the data into a text file (for example PPtoR.txt).
Read the text file in R using a tab delimiter: read.table("PPtoR.txt", sep = "\t", ...)
To get a PowerPivot table into Excel:
Create a pivot table based on your PowerPivot data.
Make sure that the pivot table you created has something in the Values area, but nothing in the Filters, Columns, or Rows areas.
Go to Data > Connections.
Select your Data model and click Properties.
On the Usage tab, under OLAP Drill Through, set the maximum number of records to retrieve as high as you need (the maximum is 9999999 records).
Double-click the measures area in the pivot table to drill through.
Another solution is:
Import the PowerPivot model into Power BI Desktop.
Export the results from Power BI Desktop using a PowerShell script.
Here is an example:
https://github.com/djouallah/PowerBI_Desktop_Export_CSV
A pure Excel / VBA solution is below. This is adapted from the code here to use FileSystemObject and write 1k rows at a time to the file. You'll need to add Microsoft ActiveX Data Objects Library and Microsoft Scripting Runtime as references.
Option Explicit
Public FSO As New FileSystemObject
Public Sub ExportToCsv()
Dim wbTarget As Workbook
Dim ws As Worksheet
Dim rs As Object
Dim sQuery As String
'Suppress alerts and screen updates
With Application
.ScreenUpdating = False
.DisplayAlerts = False
End With
'Bind to active workbook
Set wbTarget = ActiveWorkbook
Err.Clear
On Error GoTo ErrHandler
'Make sure the model is loaded
wbTarget.Model.Initialize
'Send query to the model
sQuery = "EVALUATE <Query>"
Set rs = CreateObject("ADODB.Recordset")
rs.Open sQuery, wbTarget.Model.DataModelConnection.ModelConnection.ADOConnection
Dim CSVData As String
Call WriteRecordsetToCSV(rs, "<ExportPath>", True)
rs.Close
Set rs = Nothing
ExitPoint:
With Application
.ScreenUpdating = True
.DisplayAlerts = True
End With
Set rs = Nothing
Exit Sub
ErrHandler:
MsgBox "An error occurred - " & Err.Description, vbOKOnly
Resume ExitPoint
End Sub
Public Sub WriteRecordsetToCSV(rsData As ADODB.Recordset, _
FileName As String, _
Optional ShowColumnNames As Boolean = True, _
Optional NULLStr As String = "")
'Writes the recordset to a .CSV file, 1000 rows at a time
'Option: save column titles
Dim TxtStr As TextStream
Dim K As Long, CSVData As String
'Open file
Set TxtStr = FSO.CreateTextFile(FileName, True, True)
If ShowColumnNames Then
For K = 0 To rsData.Fields.Count - 1
CSVData = CSVData & ",""" & rsData.Fields(K).Name & """"
Next K
CSVData = Mid(CSVData, 2) & vbNewLine
TxtStr.Write CSVData
End If
Do While rsData.EOF = False
CSVData = """" & rsData.GetString(adClipString, 1000, """,""", """" & vbNewLine & """", NULLStr)
CSVData = Left(CSVData, Len(CSVData) - Iif(rsData.EOF, 3, 2))
TxtStr.Write CSVData
Loop
TxtStr.Close
End Sub
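The point of writing 1,000 rows at a time is that the whole table never has to be held in memory as one string. The same batching idea is shown below as a small Python sketch using the standard `csv` module; it is not a port of the VBA above, and `write_csv_in_batches` is a hypothetical name. It assumes the rows arrive from some iterable source, which stands in for the recordset.

```python
import csv
from itertools import islice
from typing import Iterable, Sequence, TextIO


def write_csv_in_batches(
    rows: Iterable[Sequence[object]],
    header: Sequence[str],
    out: TextIO,
    batch_size: int = 1000,
) -> int:
    """Write header and rows to out, pulling batch_size rows at a time.

    Returns the number of data rows written.
    """
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    writer.writerow(header)
    iterator = iter(rows)
    written = 0
    while True:
        # islice takes at most batch_size rows without reading the rest.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        writer.writerows(batch)
        written += len(batch)
    return written
```

Letting the `csv` module handle quoting avoids the manual trimming of trailing delimiters that the string-based approach needs after each batch.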
Here is a lovely low-tech way:
https://www.sqlbi.com/articles/linkback-tables-in-powerpivot-for-excel-2013/
I think the process is a little different in Excel 2016. If you have Excel 2016, you just go to the data tab, go to Get External Data, and then Existing Connections (and look under Tables).
The other important thing is to click on Unlink (under Table Tools - Design - External Table Data). This unlinks it from the source data, so it really is just an export.
You can copy that data into another workbook should you wish to.
Data in Power Pivot is stored as a data model; you can use DAX Studio to export it to CSV or SQL.
When the export has finished, you will see that each table in the model corresponds to a CSV file or SQL table.

Searching C Drive for a file with VBScript

I am very new to VBScript, and I am trying to write a simple script that will extract a file in a directory to a new directory. So far this is what I have (and it works well):
'USER VAR REPRESENTS WINDOWS USERNAME
Set oShell = CreateObject( "WScript.Shell" )
user=oShell.ExpandEnvironmentStrings("%UserName%")
'FOLDER TO BE EXTRACTED
ZipFile="C:\Users\"&user&"\Downloads\Test.zip"
'LOCATION TO EXTRACT FILES
ExtractTo="C:\Users\"&user&"\desktop"
'EXTRACT ZIP FILE
Set objShell = CreateObject("Shell.Application")
Set FilesInZip=objShell.NameSpace(ZipFile).items
objShell.NameSpace(ExtractTo).CopyHere(FilesInZip)
Set fso = Nothing
Set objShell = Nothing
Set oShell = Nothing
Now, if possible, if the "Desktop" folder cannot be found, or the "Test.zip" file cannot be found, I would like to search the C Drive for them, and then proceed with extracting, etc. I have seen some examples, but I cannot understand how to replicate them. How can I search the entire C drive and sub folders for these files?
Help would be appreciated, thanks in advance!
In general a recursive search can be done like this:
Function SearchFolder(fldr, name)
Set SearchFolder = Nothing
For Each f In fldr.Files
If LCase(f.Name) = LCase(name) Then
Set SearchFolder = f
Exit Function
End If
Next
For Each sf In fldr.SubFolders
Set result = SearchFolder(sf, name)
If Not result Is Nothing Then
Set SearchFolder = result
Exit Function
End If
Next
End Function
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = SearchFolder(fso.GetFolder("C:\"), "Test.zip")
However, searching a whole drive that way will take quite some time. Also there are several folders that users don't have access to, so you'll have to account for that if you want to implement a search like this.
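For comparison, here is how the same recursive search with tolerance for inaccessible folders might look in Python. This is a sketch under the assumption that skipping unreadable folders silently is acceptable; `find_file` is a hypothetical name. It relies on the documented default behaviour of `os.walk`, which ignores errors raised while listing a directory.

```python
import os
from pathlib import Path
from typing import Optional


def find_file(root: str, name: str) -> Optional[Path]:
    """Return the first file under root whose name matches, ignoring case."""
    wanted = name.lower()
    # os.walk ignores directory-listing errors by default, so folders
    # the user cannot open are skipped instead of aborting the search.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower() == wanted:
                return Path(dirpath) / filename
    return None
```

Starting the search from a narrower root, such as the user's profile folder, before falling back to the whole drive would usually cut the search time considerably.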
