after match, get next line in a file using python - python-3.6

i have a file with multiple lines like this:
Port id: 20
Port Discription: 20
System Name: cisco-sw-1st
System Description:
Cisco 3750cx Switch
i want to get the next line, if the match found in the previous line, how would i do that.
with open("system_detail.txt") as fh:
show_lldp = fh.readlines()
data_lldp = {}
for line in show_lldp:
if line.startswith("System Name: "):
fields = line.strip().split(": ")
data_lldp[fields[0]] = fields[1]
elif line.startswith("Port id: "):
fields = line.strip().split(": ")
data_lldp[fields[0]] = fields[1]
elif line.startswith("System Description:\n"):
# here i Want to get the next line and append it to the dictionary as a value and assign a
# key to it
pass
print()
print(data_lldp)

Iterate each line in text and then use next when match found
Ex:
data_lldp = {}
with open("system_detail.txt") as fh:
for line in fh: #Iterate each line
if line.startswith("System Name: "):
fields = line.strip().split(": ")
data_lldp[fields[0]] = fields[1]
elif line.startswith("Port id: "):
fields = line.strip().split(": ")
data_lldp[fields[0]] = fields[1]
elif line.startswith("System Description:\n"):
data_lldp['Description'] = next(fh) #Use next() to get next line
print()
print(data_lldp)

Check out this question about getting the next value(in your case the next line) in a loop.
Python - Previous and next values inside a loop

Related

The encryption won't decrypt

I was given an encrypted copy of the study guide here, but how do you decrypt and read it???
In a file called pa11.py write a method called decode(inputfile,outputfile). Decode should take two parameters - both of which are strings. The first should be the name of an encoded file (either helloworld.txt or superdupertopsecretstudyguide.txt or yet another file that I might use to test your code). The second should be the name of a file that you will use as an output file.
Your method should read in the contents of the inputfile and, using the scheme described in the hints.txt file above, decode the hidden message, writing to the outputfile as it goes (or all at once when it is done depending on what you decide to use).
The penny math lecture is here.
"""
Program: pennyMath.py
Author: CS 1510
Description: Calculates the penny math value of a string.
"""
# Get the input string
original = input("Enter a string to get its cost in penny math: ")
cost = 0
Go through each character in the input string
for char in original:
value = ord(char) #ord() gives us the encoded number!
if char>="a" and char<="z":
cost = cost+(value-96) #offset the value of ord by 96
elif char>="A" and char<="Z":
cost = cost+(value-64) #offset the value of ord by 64
print("The cost of",original,"is",cost)
Another hint: Don't forget about while loops...
Another hint: After letters -
skip ahead by their pennymath value positions + 2
After numbers - skip ahead by their number + 7 positions
After anything else - just skip ahead by 1 position
The issue I'm having in that I cant seem to get the coding right to decode the file it comes out looking the same. This is the current code I have been using. But once I try to decrypt the message it stays the same.
def pennycost(c):
if c >="a" and c <="z":
return ord(c)-96
elif c>="A" and c<="Z":
return ord(c)-64
def decryption(inputfile,outputfile):
with open(inputfile) as f:
fo = open(outputfile,"w")
count = 0
while True:
c = f.read(1)
if not c:
break;
if count > 0:
count = count -1;
continue
elif c.isalpha():
count = pennycost(c)
fo.write(c)
elif c.isdigit():
count = int(c)
fo.write(c)
else:
count = 6
fo.write(c)
fo.close()
inputfile = input("Please enter the input file name: ")
outputfile = input("Plese enter the output file name(EXISTING FILE WILL BE OVER WRITTEN!): ")
decryption(inputfile,outputfile)

testing errors: comparing 2 files

I have 5 functions working relatively
1- singleline_diff(line1, line2)
comparing 2 line in one file
Inputs:
line1 - first single line string
line2 - second single line string
Output:
the index of the first difference between the two lines
identical if the two lines are the same.
2- singleline_diff_format(line1, line2, idx):
comparing 2 line in one file
Inputs:
line1 - first single line string
line2 - second single line string
idx - index at which to indicate difference (from 1st function)
Output:
abcd (first line)
==^ (= indicate identical character, ^ indicate the difference)
abef (second line)
If either input line contains a newline or carriage return,
then returns an empty string.
If idx is not a valid index, then returns an empty string.
3- multiline_diff(lines1, lines2):
deal with two lists of lines
Inputs:
lines1 - list of single line strings
lines2 - list of single line strings
Output:
a tuple containing the line number (starting from 0) and
the index in that line where the first difference between lines1
and lines2 occurs.
Returns (IDENTICAL, IDENTICAL) if the two lists are the same.
4-get_file_lines(filename)
Inputs:
filename - name of file to read
Output:
a list of lines from the file named filename.
If the file does not exist or is not readable, then the
behavior of this function is undefined.
5- file_diff_format(filename1, filename2) " the function with the problem"
deals with two input files
Inputs:
filename1 - name of first file
filename2 - name of second file
Output:
four line string showing the location of the first
difference between the two files named by the inputs.
If the files are identical, the function instead returns the
string "No differences\n".
If either file does not exist or is not readable, then the
behavior of this function is undefined.
testing the function:
everything goes will until it the test use one empty file
it gave me "list index out of range"
this is the code I use
def file_diff_format(filename1, filename2):
file_1 = get_file_lines(filename1)
file_2 = get_file_lines(filename2)
mli_dif = multiline_diff(file_1, file_2)
min_lens = min(len(file_1), len(file_2))
if mli_dif == (-1,-1) :
return "No differences" "\n"
else:
diff_line_indx = mli_dif[0]
diff_str_indx = int (mli_dif[1])
if len(file_1) >= 0:
line_file_1 = ""
else:
line_file_1 = file_1[diff_line_indx]
if len(file_2) >= 0:
line_file_2 = ""
else:
line_file_2 = file_2[diff_line_indx]
line_file_1 = file_1[diff_line_indx]
line_file_2 = file_2 [diff_line_indx]
out_print = singleline_diff_format(line_file_1, line_file_2, diff_str_indx)
return ( "Line {}{}{}".format ((diff_line_indx), (":\n"), (out_print)))
If one of the files is empty, either file1 or file2 should be an empty list, so that trying to access an element of either would cause the error you describe.
Your code checks for these files to be empty when assigning to line_file_`` andline_file_2`, but then goes ahead and tries to access elements of both.

IndexError: list index out of range, scores.append( (fields[0], fields[1]))

I'm trying to read a file and put contents in a list. I have done this mnay times before and it has worked but this time it throws back the error "list index out of range".
the code is:
with open("File.txt") as f:
scores = []
for line in f:
fields = line.split()
scores.append( (fields[0], fields[1]))
print(scores)
The text file is in the format;
Alpha:[0, 1]
Bravo:[0, 0]
Charlie:[60, 8, 901]
Foxtrot:[0]
I cant see why it is giving me this problem. Is it because I have more than one value for each item? Or is it the fact that I have a colon in my text file?
How can I get around this problem?
Thanks
If I understand you well this code will print you desired result:
import re
with open("File.txt") as f:
# Let's make dictionary for scores {name:scores}.
scores = {}
# Define regular expressin to parse team name and team scores from line.
patternScore = '\[([^\]]+)\]'
patternName = '(.*):'
for line in f:
# Find value for team name and its scores.
fields = re.search(patternScore, line).groups()[0].split(', ')
name = re.search(patternName, line).groups()[0]
# Update dictionary with new value.
scores[name] = fields
# Print output first goes first element of keyValue in dict then goes keyName
for key in scores:
print (scores[key][0] + ':' + key)
You will recieve following output:
60:Charlie
0:Alpha
0:Bravo
0:Foxtrot

how to print recursively a Python dictionary and its subdictionaries with whitespace alignment into columns

I want to create a function that can take a dictionary of dictionaries such as the following
information = {
"sample information": {
"ID": 169888,
"name": "ttH",
"number of events": 124883,
"cross section": 0.055519,
"k factor": 1.0201,
"generator": "pythia8",
"variables": {
"trk_n": 147,
"zappo_n": 9001
}
}
}
and then print it in a neat way such as the following, with alignment of keys and values using whitespace:
sample information:
ID: 169888
name: ttH
number of events: 124883
cross section: 0.055519
k factor: 1.0201
generator: pythia8
variables:
trk_n: 147
zappo_n: 9001
My attempt at the function is the following:
def printDictionary(
dictionary = None,
indentation = ''
):
for key, value in dictionary.iteritems():
if isinstance(value, dict):
print("{indentation}{key}:".format(
indentation = indentation,
key = key
))
printDictionary(
dictionary = value,
indentation = indentation + ' '
)
else:
print(indentation + "{key}: {value}".format(
key = key,
value = value
))
It produces the output like the following:
sample information:
name: ttH
generator: pythia8
cross section: 0.055519
variables:
zappo_n: 9001
trk_n: 147
number of events: 124883
k factor: 1.0201
ID: 169888
As is shown, it successfully prints the dictionary of dictionaries recursively, however is does not align the values into a neat column. What would be some reasonable way of doing this for dictionaries of arbitrary depth?
Try using the pprint module. Instead of writing your own function, you can do this:
import pprint
pprint.pprint(my_dict)
Be aware that this will print characters such as { and } around your dictionary and [] around your lists, but if you can ignore them, pprint() will take care of all the nesting and indentation for you.

Can openoffice count words from console?

i have a small problem i need to count words inside the console to read doc, docx, pptx, ppt, xls, xlsx, odt, pdf ... so don't suggest me | wc -w or grep because they work only with text or console output and they count only spaces and in japanese, chinese, arabic , hindu , hebrew they use diferent delimiter so the word count is wrong and i tried to count with this
pdftotext file.pdf -| wc -w
/usr/local/bin/docx2txt.pl < file.docx | wc -w
/usr/local/bin/pptx2txt.pl < file.pptx | wc -w
antiword file.doc -| wc -w
antiword file.word -| wc -w
in some cases microsoft word , openoffice sad 1000 words and the counters return 10 or 300 words if the language is ( japanese , chinese, hindu ect... ) , but if i use normal characters then i have no issue the biggest mistake is in some case 3 chars less witch is "OK"
i tried to convert with soffice , openoffice and then try WC -w but i can't even convert ,
soffice --headless --nofirststartwizard --accept=socket,host=127.0.0.1,port=8100; --convert-to pdf some.pdf /var/www/domains/vocabridge.com/devel/temp_files/23/0/东京_1000_words_Docx.docx
OR
openoffice.org --headless --convert-to ........
OR
openoffice.org3 --invisible
so if someone know any way to count correctly or display document statistic with openoffice or anything else or linux with the console please share it
thanks.
If you have Microsoft Word (and Windows, obviously) you can write a VBA macro or if you want to run straight from the command line you can write a VBScript script with something like the following:
wordApp = CreateObject("Word.Application")
doc = ... ' open up a Word document using wordApp
docWordCount = doc.Words.Count
' Rinse and repeat...
If you have OpenOffice.org/LibreOffice you have similar (but more) options. If you want to stay in the office app and run a macro you can probably do that. I don't know the StarBasic API well enough to tell you how but I can give you the alternative: creating a Python script to get the word count from the command line. Roughly speaking, you do the following:
Start up your copy of OOo/LibO from the command line with the appropriate parameters to accept incoming socket connections. http://www.openoffice.org/udk/python/python-bridge.html has instructions on how to do that. Go there and use the browser's in-page find feature to search for `accept=socket'
Write a Python script to use the OOo/LibO UNO bridge (basically equivalent to the VBScript example above) to open up your Word/ODT documents one at a time and get the word count from each. The above page should give you a good start to doing that.
You get the word count from a document model object's WordCount property: http://www.openoffice.org/api/docs/common/ref/com/sun/star/text/GenericTextDocument.html#WordCount
Just building on to what #Yawar wrote. Here is is more explicit steps for how to word count with MS word from the console.
I also use the more accurate Range.ComputeStatistics(wdStatisticWords) instead of the Words property. See here for more info: https://support.microsoft.com/en-za/help/291447/word-count-appears-inaccurate-when-you-use-the-vba-words-property
Make a script called wc.vbs and then put this in it:
Set word = CreateObject("Word.Application")
word.Visible = False
Set doc = word.Documents.Open("<replace with absolute path to your .docx/.pdf>")
docWordCount = doc.Range.ComputeStatistics(wdStatisticWords)
word.Quit
Dim StdOut : Set StdOut = CreateObject("Scripting.FileSystemObject").GetStandardStream(1)
WScript.Echo docWordCount & " words"
Open powershell in the directory you saved wc.vbs and run cscript .\wc.vbs and you'll get back the word count :)
I think this may do what you are aiming for
# Continuously updating word count
import unohelper, uno, os, time
from com.sun.star.i18n.WordType import WORD_COUNT
from com.sun.star.i18n import Boundary
from com.sun.star.lang import Locale
from com.sun.star.awt import XTopWindowListener
#socket = True
socket = False
localContext = uno.getComponentContext()
if socket:
resolver = localContext.ServiceManager.createInstanceWithContext('com.sun.star.bridge.UnoUrlResolver', localContext)
ctx = resolver.resolve('uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext')
else: ctx = localContext
smgr = ctx.ServiceManager
desktop = smgr.createInstanceWithContext('com.sun.star.frame.Desktop', ctx)
waittime = 1 # seconds
def getWordCountGoal():
doc = XSCRIPTCONTEXT.getDocument()
retval = 0
# Only if the field exists
if doc.getTextFieldMasters().hasByName('com.sun.star.text.FieldMaster.User.WordCountGoal'):
# Get the field
wordcountgoal = doc.getTextFieldMasters().getByName('com.sun.star.text.FieldMaster.User.WordCountGoal')
retval = wordcountgoal.Content
return retval
goal = getWordCountGoal()
def setWordCountGoal(goal):
doc = XSCRIPTCONTEXT.getDocument()
if doc.getTextFieldMasters().hasByName('com.sun.star.text.FieldMaster.User.WordCountGoal'):
wordcountgoal = doc.getTextFieldMasters().getByName('com.sun.star.text.FieldMaster.User.WordCountGoal')
wordcountgoal.Content = goal
# Refresh the field if inserted in the document from Insert > Fields >
# Other... > Variables > Userdefined fields
doc.TextFields.refresh()
def printOut(txt):
if socket: print txt
else:
model = desktop.getCurrentComponent()
text = model.Text
cursor = text.createTextCursorByRange(text.getEnd())
text.insertString(cursor, txt + '\r', 0)
def hotCount(st):
'''Counts the number of words in a string.
ARGUMENTS:
str st: count the number of words in this string
RETURNS:
int: the number of words in st'''
startpos = long()
nextwd = Boundary()
lc = Locale()
lc.Language = 'en'
numwords = 1
mystartpos = 1
brk = smgr.createInstanceWithContext('com.sun.star.i18n.BreakIterator', ctx)
nextwd = brk.nextWord(st, startpos, lc, WORD_COUNT)
while nextwd.startPos != nextwd.endPos:
numwords += 1
nw = nextwd.startPos
nextwd = brk.nextWord(st, nw, lc, WORD_COUNT)
return numwords
def updateCount(wordCountModel, percentModel):
'''Updates the GUI.
Updates the word count and the percentage completed in the GUI. If some
text of more than one word is selected (including in multiple selections by
holding down the Ctrl/Cmd key), it updates the GUI based on the selection;
if not, on the whole document.'''
model = desktop.getCurrentComponent()
try:
if not model.supportsService('com.sun.star.text.TextDocument'):
return
except AttributeError: return
sel = model.getCurrentSelection()
try: selcount = sel.getCount()
except AttributeError: return
if selcount == 1 and sel.getByIndex(0).getString == '':
selcount = 0
selwords = 0
for nsel in range(selcount):
thisrange = sel.getByIndex(nsel)
atext = thisrange.getString()
selwords += hotCount(atext)
if selwords > 1: wc = selwords
else:
try: wc = model.WordCount
except AttributeError: return
wordCountModel.Label = str(wc)
if goal != 0:
pc_text = 100 * (wc / float(goal))
#pc_text = '(%.2f percent)' % (100 * (wc / float(goal)))
percentModel.ProgressValue = pc_text
else:
percentModel.ProgressValue = 0
# This is the user interface bit. It looks more or less like this:
###############################
# Word Count _ o x #
###############################
# _____ #
# 451 / |500| #
# ----- #
# ___________________________ #
# |############## | #
# --------------------------- #
###############################
# The boxed `500' is the text entry box.
class WindowClosingListener(unohelper.Base, XTopWindowListener):
def __init__(self):
global keepGoing
keepGoing = True
def windowClosing(self, e):
global keepGoing
keepGoing = False
setWordCountGoal(goal)
e.Source.setVisible(False)
def addControl(controlType, dlgModel, x, y, width, height, label, name = None):
control = dlgModel.createInstance(controlType)
control.PositionX = x
control.PositionY = y
control.Width = width
control.Height = height
if controlType == 'com.sun.star.awt.UnoControlFixedTextModel':
control.Label = label
elif controlType == 'com.sun.star.awt.UnoControlEditModel':
control.Text = label
elif controlType == 'com.sun.star.awt.UnoControlProgressBarModel':
control.ProgressValue = label
if name:
control.Name = name
dlgModel.insertByName(name, control)
else:
control.Name = 'unnamed'
dlgModel.insertByName('unnamed', control)
return control
def loopTheLoop(goalModel, wordCountModel, percentModel):
global goal
while keepGoing:
try: goal = int(goalModel.Text)
except: goal = 0
updateCount(wordCountModel, percentModel)
time.sleep(waittime)
if not socket:
import threading
class UpdaterThread(threading.Thread):
def __init__(self, goalModel, wordCountModel, percentModel):
threading.Thread.__init__(self)
self.goalModel = goalModel
self.wordCountModel = wordCountModel
self.percentModel = percentModel
def run(self):
loopTheLoop(self.goalModel, self.wordCountModel, self.percentModel)
def wordCount(arg = None):
'''Displays a continuously updating word count.'''
dialogModel = smgr.createInstanceWithContext('com.sun.star.awt.UnoControlDialogModel', ctx)
dialogModel.PositionX = XSCRIPTCONTEXT.getDocument().CurrentController.Frame.ContainerWindow.PosSize.Width / 2.2 - 105
dialogModel.Width = 100
dialogModel.Height = 30
dialogModel.Title = 'Word Count'
lblWc = addControl('com.sun.star.awt.UnoControlFixedTextModel', dialogModel, 6, 2, 25, 14, '', 'lblWc')
lblWc.Align = 2 # Align right
addControl('com.sun.star.awt.UnoControlFixedTextModel', dialogModel, 33, 2, 10, 14, ' / ')
txtGoal = addControl('com.sun.star.awt.UnoControlEditModel', dialogModel, 45, 1, 25, 12, '', 'txtGoal')
txtGoal.Text = goal
#addControl('com.sun.star.awt.UnoControlFixedTextModel', dialogModel, 6, 25, 50, 14, '(percent)', 'lblPercent')
ProgressBar = addControl('com.sun.star.awt.UnoControlProgressBarModel', dialogModel, 6, 15, 88, 10,'' , 'lblPercent')
ProgressBar.ProgressValueMin = 0
ProgressBar.ProgressValueMax =100
#ProgressBar.Border = 2
#ProgressBar.BorderColor = 255
#ProgressBar.FillColor = 255
#ProgressBar.BackgroundColor = 255
addControl('com.sun.star.awt.UnoControlFixedTextModel', dialogModel, 124, 2, 12, 14, '', 'lblMinus')
controlContainer = smgr.createInstanceWithContext('com.sun.star.awt.UnoControlDialog', ctx)
controlContainer.setModel(dialogModel)
controlContainer.addTopWindowListener(WindowClosingListener())
controlContainer.setVisible(True)
goalModel = controlContainer.getControl('txtGoal').getModel()
wordCountModel = controlContainer.getControl('lblWc').getModel()
percentModel = controlContainer.getControl('lblPercent').getModel()
ProgressBar.ProgressValue = percentModel.ProgressValue
if socket:
loopTheLoop(goalModel, wordCountModel, percentModel)
else:
uthread = UpdaterThread(goalModel, wordCountModel, percentModel)
uthread.start()
keepGoing = True
if socket:
wordCount()
else:
g_exportedScripts = wordCount,
Link for more info
https://superuser.com/questions/529446/running-word-count-in-openoffice-writer
Hope this helps regards tom
EDIT : Then i found this
http://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=22555
wc can understand Unicode and uses system's iswspace function to find whether the unicode character is whitespace. "The iswspace() function tests whether wc is a wide-character code representing a character of class space in the program's current locale." So, wc -w should be able to correctly count words if your locale (LC_CTYPE) is configured correctly.
The source code of the wc program
The manual page for the iswspace function
I found the answer create one service
#!/bin/sh
#
# chkconfig: 345 99 01
#
# description: your script is a test service
#
(while sleep 1; do
ls pathwithfiles/in | while read file; do
libreoffice --headless -convert-to pdf "pathwithfiles/in/$file" --outdir pathwithfiles/out
rm "pathwithfiles/in/$file"
done
done) &
then the php script that i needed counted everything
$ext = pathinfo($absolute_file_path, PATHINFO_EXTENSION);
if ($ext !== 'txt' && $ext !== 'pdf') {
// Convert to pdf
$tb = mktime() . mt_rand();
$tempfile = 'locationofpdfs/in/' . $tb . '.' . $ext;
copy($absolute_file_path, $tempfile);
$absolute_file_path = 'locationofpdfs/out/' . $tb . '.pdf';
$ext = 'pdf';
while (!is_file($absolute_file_path)) sleep(1);
}
if ($ext !== 'txt') {
// Convert to txt
$tempfile = tempnam(sys_get_temp_dir(), '');
shell_exec('pdftotext "' . $absolute_file_path . '" ' . $tempfile);
$absolute_file_path = $tempfile;
$ext = 'txt';
}
if ($ext === 'txt') {
$seq = '/[\s\.,;:!\? ]+/mu';
$plain = file_get_contents($absolute_file_path);
$plain = preg_replace('#\{{{.*?\}}}#su', "", $plain);
$str = preg_replace($seq, '', $plain);
$chars = count(preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY));
$words = count(preg_split($seq, $plain, -1, PREG_SPLIT_NO_EMPTY));
if ($words === 0) return $chars;
if ($chars / $words > 10) $words = $chars;
return $words;
}

Resources