IndexError: list index out of range, scores.append( (fields[0], fields[1])) - python-3.4

I'm trying to read a file and put its contents in a list. I have done this many times before and it has worked, but this time it throws the error "list index out of range".
The code is:
with open("File.txt") as f:
    scores = []
    for line in f:
        fields = line.split()
        scores.append( (fields[0], fields[1]))
print(scores)
The text file is in the format:
Alpha:[0, 1]
Bravo:[0, 0]
Charlie:[60, 8, 901]
Foxtrot:[0]
I can't see why it is giving me this problem. Is it because I have more than one value for each item? Or is it the fact that I have a colon in my text file?
How can I get around this problem?
Thanks

If I understand you correctly, this code will print the desired result:
import re
with open("File.txt") as f:
    # Build a dictionary of scores: {name: scores}.
    scores = {}
    # Define regular expressions to parse the team name and team scores from a line.
    patternScore = r'\[([^\]]+)\]'
    patternName = r'(.*):'
    for line in f:
        # Find the team name and its scores.
        fields = re.search(patternScore, line).groups()[0].split(', ')
        name = re.search(patternName, line).groups()[0]
        # Update the dictionary with the new value.
        scores[name] = fields
# Print the first score of each entry, followed by its name.
for key in scores:
    print(scores[key][0] + ':' + key)
You will receive the following output (the order may vary, since dictionaries are unordered in Python 3.4):
60:Charlie
0:Alpha
0:Bravo
0:Foxtrot
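As for why the original code fails: line.split() with no arguments splits on whitespace, so a line such as "Foxtrot:[0]" produces a one-element list and fields[1] raises the IndexError. A minimal sketch of an alternative, assuming every non-blank line has the form name:[list of integers], is to split once on the colon and parse the bracketed part with ast.literal_eval. The function name parse_scores and the sample list are only illustrative; with a real file you would pass the open file object instead.

```python
import ast

def parse_scores(lines):
    """Return a list of (name, scores) tuples from lines like 'Alpha:[0, 1]'."""
    scores = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once on the first colon: 'Alpha:[0, 1]' -> 'Alpha', '[0, 1]'
        name, _, raw = line.partition(":")
        # literal_eval safely turns the text '[0, 1]' into a real list
        scores.append((name, ast.literal_eval(raw.strip())))
    return scores

# Sample data copied from the question (assumed file format)
sample = ["Alpha:[0, 1]", "Bravo:[0, 0]", "Charlie:[60, 8, 901]", "Foxtrot:[0]"]
print(parse_scores(sample))
```

Unlike the regular-expression version, this keeps the scores as integers, so they can be summed or compared directly.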

Related

The encryption won't decrypt

I was given an encrypted copy of the study guide here, but how do you decrypt and read it???
In a file called pa11.py write a method called decode(inputfile,outputfile). Decode should take two parameters - both of which are strings. The first should be the name of an encoded file (either helloworld.txt or superdupertopsecretstudyguide.txt or yet another file that I might use to test your code). The second should be the name of a file that you will use as an output file.
Your method should read in the contents of the inputfile and, using the scheme described in the hints.txt file above, decode the hidden message, writing to the outputfile as it goes (or all at once when it is done depending on what you decide to use).
The penny math lecture is here.
"""
Program: pennyMath.py
Author: CS 1510
Description: Calculates the penny math value of a string.
"""
# Get the input string
original = input("Enter a string to get its cost in penny math: ")
cost = 0
# Go through each character in the input string
for char in original:
    value = ord(char) #ord() gives us the encoded number!
    if char>="a" and char<="z":
        cost = cost+(value-96) #offset the value of ord by 96
    elif char>="A" and char<="Z":
        cost = cost+(value-64) #offset the value of ord by 64
print("The cost of",original,"is",cost)
Another hint: Don't forget about while loops...
Another hint: After letters -
skip ahead by their pennymath value positions + 2
After numbers - skip ahead by their number + 7 positions
After anything else - just skip ahead by 1 position
The issue I'm having is that I can't seem to get the code right to decode the file: the output comes out looking the same as the input. This is the code I have been using.
def pennycost(c):
    if c >="a" and c <="z":
        return ord(c)-96
    elif c>="A" and c<="Z":
        return ord(c)-64
def decryption(inputfile,outputfile):
    with open(inputfile) as f:
        fo = open(outputfile,"w")
        count = 0
        while True:
            c = f.read(1)
            if not c:
                break;
            if count > 0:
                count = count -1;
                continue
            elif c.isalpha():
                count = pennycost(c)
                fo.write(c)
            elif c.isdigit():
                count = int(c)
                fo.write(c)
            else:
                count = 6
                fo.write(c)
        fo.close()
inputfile = input("Please enter the input file name: ")
outputfile = input("Plese enter the output file name(EXISTING FILE WILL BE OVER WRITTEN!): ")
decryption(inputfile,outputfile)
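Comparing the code with the hints, the skip counts do not match: the hints add 2 to the penny math value after a letter and 7 after a digit, and move on by only 1 after anything else, whereas the code above uses the bare value, the bare digit, and 6. Below is a minimal sketch of the hinted scheme. The hints.txt file is not shown, so this rests on two assumptions: "skip ahead by n positions" means the next kept character sits n indices further on, and decoding starts at the first character of the file. The helper names penny_value and decode_text are illustrative.

```python
def penny_value(ch):
    """Penny math value of an ASCII letter (a/A = 1 ... z/Z = 26), else None."""
    if "a" <= ch <= "z":
        return ord(ch) - 96
    if "A" <= ch <= "Z":
        return ord(ch) - 64
    return None

def decode_text(encoded):
    """Keep the character at the current index, then jump ahead per the hints."""
    kept = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        kept.append(ch)
        value = penny_value(ch)
        if value is not None:
            i += value + 2       # after letters: penny math value + 2
        elif "0" <= ch <= "9":
            i += int(ch) + 7     # after numbers: the number + 7
        else:
            i += 1               # after anything else: just the next position
    return "".join(kept)

def decode(inputfile, outputfile):
    """Read the encoded file, decode it, and write the result in one go."""
    with open(inputfile) as src:
        encoded = src.read()
    with open(outputfile, "w") as dst:
        dst.write(decode_text(encoded))
```

If the decoded text still looks wrong, the offsets are the first thing to check against hints.txt, for example whether the skip counts positions from the current character or from the one after it.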

tuple index out of range for printing a index value

While executing the following code I'm getting the error below. For information, matchObj here contains tuple values.
$ ./ftpParser3_re_dup.py
Traceback (most recent call last):
File "./ftpParser3_re_dup.py", line 13, in <module>
print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag"))
IndexError: tuple index out of range
Code is below:
from __future__ import print_function
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE,SIG_DFL)
import re
with open('all_adta', 'r') as f:
    for line in f:
        line = line.strip()
        data = f.read()
        # Making description & termflag optional in the regex pattern as it's missing in the "data_test" file with several occurrences.
        regex = (r"dn:(.*?)\nftpuser: (.*)\n(?:description:* (.*))?\n(?:termflag:* (.*))")
        matchObj = re.findall(regex, data)
        print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag"))
        print("{0:<30}{1:<20}{2:<50}{3:<15}".format("-----------","------------","--------"))
        for index in matchObj:
            index_str = ' '.join(index)
            new_str = re.sub(r'[=,]', ' ', index_str)
            new_str = new_str.split()
            # In below print statement we are using "index[2]" as index is tuple here, this is because
            # findall() returns the matches as a list, However with groups, it returns it as a list of tuples.
            print("{0:<30}{1:<20}{2:<50}{3:<15}".format(new_str[1],new_str[8],index[2],index[3]))
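The format string in print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag")) refers to four positional fields ({0} through {3}), but format() is given only three arguments: "FTP ACCOUNT", "Account Type" and "Term Flag". Looking up {3} in the three-element argument tuple raises the IndexError.
Either remove the fourth field ({3:<15}) from the format string or pass a fourth argument. The same applies to the separator line that follows it.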
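A short sketch of both the failure and one possible fix. The data rows in the question print four values, so the sketch adds a fourth header; the label "Description" is an assumption based on the regex group that feeds the third column, not something stated in the question.

```python
fmt = "{0:<30}{1:<20}{2:<50}{3:<15}"

# Three arguments for four fields: {3} has nothing to refer to.
try:
    fmt.format("FTP ACCOUNT", "Account Type", "Term Flag")
    raised = False
except IndexError:
    raised = True

# One argument per field works. "Description" is a hypothetical column label.
header = fmt.format("FTP ACCOUNT", "Account Type", "Description", "Term Flag")
rule = fmt.format("-----------", "------------", "-----------", "---------")
print(header)
print(rule)
```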

testing errors: comparing 2 files

I have 5 functions that work together:
1- singleline_diff(line1, line2)
compares two single-line strings
Inputs:
line1 - first single line string
line2 - second single line string
Output:
the index of the first difference between the two lines
IDENTICAL if the two lines are the same.
2- singleline_diff_format(line1, line2, idx):
compares two single-line strings and formats the difference
Inputs:
line1 - first single line string
line2 - second single line string
idx - index at which to indicate difference (from 1st function)
Output:
abcd (first line)
==^ (= indicate identical character, ^ indicate the difference)
abef (second line)
If either input line contains a newline or carriage return,
then returns an empty string.
If idx is not a valid index, then returns an empty string.
3- multiline_diff(lines1, lines2):
deal with two lists of lines
Inputs:
lines1 - list of single line strings
lines2 - list of single line strings
Output:
a tuple containing the line number (starting from 0) and
the index in that line where the first difference between lines1
and lines2 occurs.
Returns (IDENTICAL, IDENTICAL) if the two lists are the same.
4-get_file_lines(filename)
Inputs:
filename - name of file to read
Output:
a list of lines from the file named filename.
If the file does not exist or is not readable, then the
behavior of this function is undefined.
5- file_diff_format(filename1, filename2) (the function with the problem)
deals with two input files
Inputs:
filename1 - name of first file
filename2 - name of second file
Output:
four line string showing the location of the first
difference between the two files named by the inputs.
If the files are identical, the function instead returns the
string "No differences\n".
If either file does not exist or is not readable, then the
behavior of this function is undefined.
Testing the function:
Everything goes well until the test uses one empty file;
then it gives me "list index out of range".
This is the code I use:
def file_diff_format(filename1, filename2):
    file_1 = get_file_lines(filename1)
    file_2 = get_file_lines(filename2)
    mli_dif = multiline_diff(file_1, file_2)
    min_lens = min(len(file_1), len(file_2))
    if mli_dif == (-1,-1) :
        return "No differences" "\n"
    else:
        diff_line_indx = mli_dif[0]
        diff_str_indx = int (mli_dif[1])
        if len(file_1) >= 0:
            line_file_1 = ""
        else:
            line_file_1 = file_1[diff_line_indx]
        if len(file_2) >= 0:
            line_file_2 = ""
        else:
            line_file_2 = file_2[diff_line_indx]
        line_file_1 = file_1[diff_line_indx]
        line_file_2 = file_2 [diff_line_indx]
        out_print = singleline_diff_format(line_file_1, line_file_2, diff_str_indx)
        return ( "Line {}{}{}".format ((diff_line_indx), (":\n"), (out_print)))
If one of the files is empty, either file_1 or file_2 will be an empty list, so trying to access any element of it raises the error you describe.
Your code checks whether the files are empty when assigning line_file_1 and line_file_2, but then goes ahead and indexes into both lists anyway.
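Note also that len(...) >= 0 is true for every list, so those checks never reach their else branches. One way to express the intended guard, as a minimal sketch: look the line up through a small helper that falls back to an empty string when the index is outside the list. The helper name line_or_empty is hypothetical.

```python
def line_or_empty(lines, index):
    """Return lines[index], or "" when index is outside the list."""
    if 0 <= index < len(lines):
        return lines[index]
    return ""

# An empty file gives an empty list of lines, so the lookup falls back to "".
print(repr(line_or_empty([], 0)))
print(repr(line_or_empty(["abcd", "abef"], 1)))
```

Inside file_diff_format, the two pairs of if/else blocks and the two unconditional assignments after them could then be replaced by line_file_1 = line_or_empty(file_1, diff_line_indx) and the same call for file_2.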

how to iterate over multiple links and scrape everyone of them one by one and save the output in csv using python beautifulsoup and requests

I have this code but don't know how to read the links from a CSV or a list. I want to read the links, scrape the details from every single link, and then save the data in columns corresponding to each link in an output CSV.
Here is the code I built to get specific data.
from bs4 import BeautifulSoup
import requests
url = "http://www.ebay.com/itm/282231178856"
r = requests.get(url)
x = BeautifulSoup(r.content, "html.parser")
# print(x.prettify().encode('utf-8'))
# time to find some tags!!
# y = x.find_all("tag")
z = x.find_all("h1", {"itemprop": "name"})
# print z
# for loop done to extracting the title.
for item in z:
    try:
        print item.text.replace('Details about ', '')
    except:
        pass
# category extraction done
m = x.find_all("span", {"itemprop": "name"})
# print m
for item in m:
    try:
        print item.text
    except:
        pass
# item condition extraction done
n = x.find_all("div", {"itemprop": "itemCondition"})
# print n
for item in n:
    try:
        print item.text
    except:
        pass
# sold number extraction done
k = x.find_all("span", {"class": "vi-qtyS vi-bboxrev-dsplblk vi-qty-vert-algn vi-qty-pur-lnk"})
# print k
for item in k:
    try:
        print item.text
    except:
        pass
# Watchers extraction done
u = x.find_all("span", {"class": "vi-buybox-watchcount"})
# print u
for item in u:
    try:
        print item.text
    except:
        pass
# returns details extraction done
t = x.find_all("span", {"id": "vi-ret-accrd-txt"})
# print t
for item in t:
    try:
        print item.text
    except:
        pass
#per hour day view done
a = x.find_all("div", {"class": "vi-notify-new-bg-dBtm"})
# print a
for item in a:
    try:
        print item.text
    except:
        pass
#trending at price
b = x.find_all("span", {"class": "mp-prc-red"})
#print b
for item in b:
    try:
        print item.text
    except:
        pass
Your question is kind of vague!
Which links are you talking about? There are a hundred on a single eBay page. What information would you like to scrape? There is a ton of that, too.
But anyway, here is how I would proceed:
# First, create a list of urls you want to iterate on
urls = []
# r is the response returned by requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
# Assuming your links of interest are values of "href" attributes within <a> tags
a_tags = soup.find_all("a", href=True)  # skip <a> tags that have no href
for tag in a_tags:
    urls.append(tag["href"])
# Second, start to iterate while storing the info
info_1, info_2 = [], []
for link in urls:
    # Do stuff here, maybe it's time to define your existing loops as functions?
    page = BeautifulSoup(requests.get(link).text, "html.parser")
    info_a, info_b = YourFunctionReturningValues(page)
    info_1.append(info_a)
    info_2.append(info_b)
Then if you want a nice csv output:
# Don't forget to import the csv module
import csv
# "wb" is the Python 2 mode; on Python 3 use open(path, "w", newline="") instead
with open(r"path_to_file.csv", "wb") as my_file:
    csv_writer = csv.writer(my_file, delimiter=",")
    csv_writer.writerows(zip(urls, info_1, info_2))
Hope this helps.
Of course, don't hesitate to add more information to your question if you need more detail.
On attributes with BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes
About the csv module: https://docs.python.org/2/library/csv.html
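To make the overall shape concrete, here is a minimal Python 3 sketch of the loop-then-write pattern using only the standard library. The scraping step is passed in as a function, and fake_scrape is a stand-in so the example runs without network access; in practice it would be replaced by a function that calls requests.get and BeautifulSoup as in the question. All names here (scrape_all, write_rows, fake_scrape, the column labels) are illustrative assumptions.

```python
import csv
import io

def scrape_all(urls, scrape_one):
    """Call scrape_one(url) for every url and collect one row per link."""
    rows = []
    for url in urls:
        title, condition = scrape_one(url)
        rows.append((url, title, condition))
    return rows

def write_rows(rows, out_file):
    """Write a header plus one CSV row per scraped link to an open text file."""
    writer = csv.writer(out_file)
    writer.writerow(["url", "title", "condition"])
    writer.writerows(rows)

def fake_scrape(url):
    # Stand-in for the real requests + BeautifulSoup code (hypothetical values).
    return ("title for " + url, "New")

# StringIO keeps the example self-contained; for a real file use
# open("output.csv", "w", newline="") in Python 3.
buffer = io.StringIO()
write_rows(scrape_all(["http://example.com/1", "http://example.com/2"], fake_scrape), buffer)
print(buffer.getvalue())
```

Keeping the per-page scraping in its own function means the list of links can come from anywhere, for example from csv.reader over an input file, without touching the writing code.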

Julia dictionary "key not found" only when using loop

Still trying to figure out this problem (I was having problems building a dictionary, but managed to get that working thanks to rickhg12hs).
Here's my current code:
#open files with codon:amino acid pairs, initiate dictionary:
file = open(readall, "rna_codons.txt")
seq = open(readall, "rosalind_prot.txt")
codons = {"UAA" => "stop", "UGA" => "stop", "UAG" => "stop"}
#generate dictionary entries using pairs from file:
for m in eachmatch(r"([AUGC]{3,3})\s([A-Z])\s", file)
    codon, aa = m.captures
    codons[codon] = aa
end
All of that code seems to work as intended. At this point, I have the dictionary I want, and the right keys point to the right entries. If I just do print(codons["AUG"]) for example, it prints 'M', which is the correct output. Now I want to scan through a string in the second file, and for every 3 letters, pull out the entry referenced in the dictionary and add it to the prot string. So I tried:
for m in eachmatch(r"([AUGC]{3,3})", seq)
    amac = codons[m.captures]
    prot = "$prot$amac"
end
But this kicks out the error key not found: ["AUG"]. I know the key exists, because I can print codons["AUG"] and it returns the proper entry, so why can't it find that key when it's in the loop?

Resources