I am trying to find the smallest index containing the value i in a sorted array. If i is not present, I want -1 to be returned. I am using a recursive binary search subroutine. The problem is that I can't stop the recursion properly, so I get several answers (one right and the rest wrong). Sometimes I also get an error, "Segmentation fault: 11", and no results at all.
I've tried deleting the call to random_number, since I already have a sorted array in my main program, but it did not work.
program main
implicit none
integer, allocatable :: A(:)
real :: MAX_VALUE
integer :: i,j,n,s, low, high
real :: x
N= 10 !size of table
MAX_VALUE = 10
allocate(A(n))
s = 5 ! searched value
low = 1 ! lower limit
high = n ! highest limit
!generate random table of numbers (from 0 to MAX_VALUE)
call Random_Seed
do i=1, N
call Random_Number(x) !returns random x >= 0 and <1
A(i)= anint(MAX_VALUE*x)
end do
call bubble(n,a)
print *,' '
write(*,10) (a(i),i=1,N)
10 format(10i6)
call bsearch(A,n,s,low,high)
deallocate(A)
end program main
The sort subroutine:
subroutine sort(p,q)
implicit none
integer(kind=4), intent(inout) :: p, q
integer(kind=4) :: temp
if (p>q) then
temp = p
p = q
q = temp
end if
return
end subroutine sort
The bubble subroutine:
subroutine bubble(n,arr)
implicit none
integer(kind=4), intent(inout) :: n
integer(kind=4), intent(inout) :: arr(n)
integer(kind=4) :: sorted(n)
integer :: i,j
do i=1, n
do j=n, i+1, -1
call sort(arr(j-1), arr(j))
end do
end do
return
end subroutine bubble
recursive subroutine bsearch(b,n,i,low,high)
implicit none
integer(kind=4) :: b(n)
integer(kind=4) :: low, high
integer(kind=4) :: i,j,x,idx,n
real(kind=4) :: r
idx = -1
call random_Number(r)
x = low + anint((high - low)*r)
if (b(x).lt.i) then
low = x + 1
call bsearch(b,n,i,low,high)
else if (b(x).gt.i) then
high = x - 1
call bsearch(b,n,i,low,high)
else
do j = low, high
if (b(j).eq.i) then
idx = j
exit
end if
end do
end if
! Stop if high = low
if (low.eq.high) then
return
end if
print*, i, 'found at index ', idx
return
end subroutine bsearch
The goal is to get the same results as my linear search, but I am getting one of the following outputs.
Sorted table:
1 1 2 4 5 5 6 7 8 10
5 found at index 5
5 found at index -1
5 found at index -1
or if the value is not found
2 2 3 4 4 6 6 7 8 8
Segmentation fault: 11
There are two issues causing your recursive search routine bsearch to either stop with unwanted output or end in a segmentation fault. Following the execution logic of your program through the examples you provided clarifies the matter:
1) value present and found, unwanted output
First, consider the first example, where array b contains the value i=5 you are searching for (value and index are marked with || in the first two lines of the code block below). Using the notation Rn to indicate the n'th level of recursion, L and H for the lower and upper bounds, and x for the current index estimate, a given run of your code could look something like this:
b(x): 1 1 2 4 |5| 5 6 7 8 10
x: 1 2 3 4 |5| 6 7 8 9 10
R0: L x H
R1: Lx H
R2: L x H
5 found at index 5
5 found at index -1
5 found at index -1
In R0 and R1, the tests b(x).lt.i and b(x).gt.i in bsearch work as intended to reduce the search interval. In R2 the do-loop in the else branch is executed, idx is assigned the correct value and this is printed - as intended. However, a return statement is now encountered which will return control to the calling program unit - in this case that is first R1(!) where execution will resume after the if-else if-else block, thus printing a message to screen with the initial value of idx=-1. The same happens upon returning from R0 to the main program. This explains the (unwanted) output you see.
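To make the unwinding concrete, here is a minimal Python sketch (not Fortran, and not your exact routine) that mirrors the control flow of bsearch. It assumes 0-based indexing, always guesses the midpoint instead of a random position, and collects the values that would be printed in a list named log, which is a hypothetical helper used only for illustration. Unlike the Fortran version, it does not pass low and high by reference.

```python
def bsearch_flawed(b, key, low, high, log):
    """Mimic the original control flow: report *after* the if/else block."""
    idx = -1
    x = (low + high) // 2  # assumption: midpoint guess instead of a random one
    if b[x] < key:
        bsearch_flawed(b, key, x + 1, high, log)
    elif b[x] > key:
        bsearch_flawed(b, key, low, x - 1, log)
    else:
        # linear scan for the first match, as in the original do-loop
        for j in range(low, high + 1):
            if b[j] == key:
                idx = j
                break
    if low == high:
        return
    # Every level that gets here reports its own idx, including the
    # callers whose idx is still the initial -1.
    log.append(idx)


log = []
bsearch_flawed([1, 1, 2, 4, 5, 5, 6, 7, 8, 10], 8, 0, 9, log)
print(log)  # [8, -1, -1]
```

The innermost level reports the real index (8 in 0-based terms), and each caller then reports its own untouched -1 while the recursion unwinds.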
2) value not present, segmentation fault
Secondly, consider the example resulting in a segmentation fault. Using the same notation as before, a possible run could look like this:
b(x): 2 2 3 4 4 6 6 7 8 8
x: 1 2 3 4 5 6 7 8 9 10
R0: L x H
R1: L x H
R2: L x H
R3: LxH
R4: H xL
.
.
.
Segmentation fault: 11
In R0 to R2 the search interval is again reduced as intended. However, in R3 the logic fails. Since the search value i is not present in array b, one of the .lt. or .gt. tests will always evaluate to .true., meaning that the test for low .eq. high to terminate a search is never reached. From this point onwards, the logic is no longer valid (e.g. high can be smaller than low) and the code will continue deepening the level of recursion until the call stack gets too big and a segmentation fault occurs.
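The crossing of the bounds can be traced with a small Python sketch. It is only an illustration under assumptions: 0-based indexing, midpoint guesses instead of random ones, and an artificial depth cap (depth_left, a hypothetical parameter) so that the demonstration stops instead of exhausting the call stack.

```python
def trace_flawed_search(b, key, low, high, depth_left, trace):
    """Record (low, high) at every recursion level of the flawed logic."""
    trace.append((low, high))
    if depth_left == 0:
        return  # artificial cap; the original routine has no such stop
    x = (low + high) // 2
    if b[x] < key:
        trace_flawed_search(b, key, x + 1, high, depth_left - 1, trace)
    elif b[x] > key:
        trace_flawed_search(b, key, low, x - 1, depth_left - 1, trace)
    # A low == high test placed here is only reached after the
    # recursive call returns, which never happens for a missing key.


trace = []
trace_flawed_search([2, 2, 3, 4, 4, 6, 6, 7, 8, 8], 5, 0, 9, 5, trace)
print(trace)  # [(0, 9), (5, 9), (5, 6), (5, 4), (5, 4), (5, 4)]
```

From the fourth level onwards high is smaller than low, and the same invalid interval is searched again and again.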
These explain the main logical flaws in the code. A possible inefficiency is the use of a do-loop to find the lowest index containing the searched-for value. Consider a case where the value you are looking for is e.g. i=8, and it appears at the end of your array, as below. Assume further that, by chance, the first guess for its position is x = high. Your code will then immediately branch to the do-loop, where in effect a linear search of very nearly the entire array is done to find the final result idx=9. Although correct, the intended binary search effectively becomes a linear search, which could reduce performance.
b(x): 2 2 3 4 4 6 6 7 |8| 8
x: 1 2 3 4 5 6 7 8 |9| 10
R0: L xH
8 found at index 9
Fixing the problems
At the very least, you should move the low .eq. high test to the start of the bsearch routine, so that recursion stops before invalid bounds can be defined (you then need an additional test to see whether the search value was found or not). Also, report a successful search right after it occurs, i.e. after the equality test in your do-loop, or after the additional test just mentioned. This still does not address the inefficiency of a possible linear search.
All things considered, you are probably better off reading up on algorithms for finding a "leftmost" index (e.g. on Wikipedia) or looking at a tried and tested implementation. Both of those examples use iteration instead of recursion, which is perhaps another improvement, but the same principles apply. Adapted to Fortran, this could look something like the following (only new code is shown; ... refers to existing code in your examples):
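As a rough sketch of that minimal fix, written in Python rather than Fortran: the stopping test comes first, and the result is returned from the level that finds it instead of being printed while the recursion unwinds. Note the assumptions: it uses 0-based indexing and midpoint guesses, and it tests for an empty interval (low > high), which is a slight variation on the low .eq. high test suggested above. The name bsearch_guarded is hypothetical.

```python
def bsearch_guarded(b, key, low, high):
    """Return the lowest index of key in sorted b[low..high], or -1."""
    if low > high:
        return -1  # empty interval: stop before using invalid bounds
    x = (low + high) // 2
    if b[x] < key:
        return bsearch_guarded(b, key, x + 1, high)
    if b[x] > key:
        return bsearch_guarded(b, key, low, x - 1)
    # b[x] == key: keep the original linear scan for the first match
    for j in range(low, high + 1):
        if b[j] == key:
            return j
    return -1  # not reachable, since b[x] == key guarantees a match


table = [2, 2, 3, 4, 4, 6, 6, 7, 8, 8]
print(bsearch_guarded(table, 5, 0, len(table) - 1))  # -1
print(bsearch_guarded(table, 8, 0, len(table) - 1))  # 8
```

The linear scan in the last branch is kept on purpose, so the inefficiency mentioned above is still present in this version.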
module mod_search
implicit none
contains
! Function that uses recursive binary search to look for `key` in an
! ordered `array`. Returns the array index of the leftmost occurrence
! of `key` if present in `array`, and -1 otherwise
function search_ordered (array, key) result (idx)
integer, intent(in) :: array(:)
integer, intent(in) :: key
integer :: idx
! find left most array index that could possibly hold `key`
idx = binary_search_left(1, size(array))
! if `key` is not found, return -1
if (array(idx) /= key) then
idx = -1
end if
contains
! function for recursive reduction of search interval
recursive function binary_search_left(low, high) result(idx)
integer, intent(in) :: low, high
integer :: idx
real :: r
if (high <= low ) then
! found lowest possible index where target could be
idx = low
else
! new guess
call random_number(r)
idx = low + floor((high - low)*r)
! alternative: idx = low + (high - low) / 2
if (array(idx) < key) then
! continue looking to the right of current guess
idx = binary_search_left(idx + 1, high)
else
! continue looking to the left of current guess (inclusive)
idx = binary_search_left(low, idx)
end if
end if
end function binary_search_left
end function search_ordered
! Move your routines into a module
subroutine sort(p,q)
...
end subroutine sort
subroutine bubble(n,arr)
...
end subroutine bubble
end module mod_search
! your main program
program main
use mod_search, only : search_ordered, sort, bubble ! <---- use routines from module like so
implicit none
...
! Replace your call to bsearch() with the following:
! call bsearch(A,n,s,low,high)
i = search_ordered(A, s)
if (i /= -1) then
print *, s, 'found at index ', i
else
print *, s, 'not found!'
end if
...
end program main
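For comparison, the iterative form of the same leftmost search can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the module above: it uses 0-based indexing, a half-open interval [low, high), the deterministic midpoint rather than a random guess, and the hypothetical name search_leftmost.

```python
def search_leftmost(a, key):
    """Return the lowest index of key in sorted list a, or -1 if absent."""
    low, high = 0, len(a)
    while low < high:
        mid = (low + high) // 2
        if a[mid] < key:
            low = mid + 1  # key can only be to the right of mid
        else:
            high = mid     # key is at mid or to its left
    # low is now the leftmost position where key could be
    if low < len(a) and a[low] == key:
        return low
    return -1


print(search_leftmost([1, 1, 2, 4, 5, 5, 6, 7, 8, 10], 5))  # 4
print(search_leftmost([2, 2, 3, 4, 4, 6, 6, 7, 8, 8], 5))   # -1
```

Because the interval shrinks in every pass, the loop always terminates, and no linear scan is needed to find the first of several equal values.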
Finally, depending on your actual use case, you could also just consider using the Fortran intrinsic procedure minloc, saving you the trouble of implementing all this functionality yourself. In this case, it can be done by making the following modification in your main program:
! i = search_ordered(a, s) ! <---- comment out this line
j = minloc(abs(a-s), dim=1) ! <---- replace with these two
i = merge(j, -1, a(j) == s)
where j returned from minloc will be the lowest index in the array a where s may be found, and merge is used to return j when a(j) == s and -1 otherwise.
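The logic of the minloc/merge idiom can be mirrored in a few lines of Python, shown here only as an analogy under assumptions: 0-based indexing, a non-empty list, and the hypothetical name first_index_or_minus_one. Python's min returns the first of several equally small candidates, which plays the role of minloc returning the lowest index.

```python
def first_index_or_minus_one(a, s):
    """Index of the element closest to s (first on ties), or -1 if it is not s."""
    # counterpart of j = minloc(abs(a-s), dim=1); requires a non-empty list
    j = min(range(len(a)), key=lambda k: abs(a[k] - s))
    # counterpart of i = merge(j, -1, a(j) == s)
    return j if a[j] == s else -1


print(first_index_or_minus_one([1, 1, 2, 4, 5, 5, 6, 7, 8, 10], 5))  # 4
print(first_index_or_minus_one([2, 2, 3, 4, 4, 6, 6, 7, 8, 8], 5))   # -1
```

Keep in mind that this approach inspects every element, so it trades the logarithmic cost of a binary search for brevity.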
Hi, I'm trying to understand how the macro @isdefined works.
I was expecting Chunk 1 to print out 1 2 3 4, but it is not printing anything.
Also related, I was expecting chunk 2 to print out 2 3 4 5, but it is throwing an error: "a is not defined".
# Chunk 1
for i = 1:5
if @isdefined a
print(a)
end
a = i
end
# Chunk 2
for i = 1:5
if i > 1
print(a)
end
a = i
end
Could someone help explain what is wrong about each chunk? Thank you.
The reason is that a is a local variable in the scope of the for loop. The crucial part is that the for loop follows this documented rule:
for loops, while loops, and comprehensions have the following behavior: any new variables introduced in their body scopes are freshly allocated for each loop iteration
This means that the assignment to a at the end of the loop body does not carry over to the next iteration: when the new iteration starts, the old value of a is discarded because a is freshly allocated. It only becomes defined after the a = i assignment.
Therefore you have the following behavior:
julia> for i = 1:5
if @isdefined a
println("before: ", a)
end
a = i
if @isdefined a
println("after: ", a)
end
end
after: 1
after: 2
after: 3
after: 4
after: 5
However, if a is defined in an outer scope, then its value is not for loop local and is preserved between iterations, so you have for instance:
julia> let a
for i = 1:5
if @isdefined a
println("before: ", a)
end
a = i
if @isdefined a
println("after: ", a)
end
end
end
after: 1
before: 1
after: 2
before: 2
after: 3
before: 3
after: 4
before: 4
after: 5
and
julia> let a
for i = 1:5
if i > 1
println(a)
end
a = i
end
end
1
2
3
4
I have used a let block, but it could be any kind of outer scope except global scope (in which case you would have to change a = i to global a = i to get the same effect).
I'm trying to get my head around Julia, coming from Python. Currently working through some Project Euler problems I've solved using Python in Julia to get a better feeling for the language. One thing that I do a lot (in Project Euler and in real life) is to parse a big multiline data object into an array. For example, if I have the data
data = """1 2 3 4
5 6 7 8
9 0 1 2"""
In python I might do
import numpy as np

def parse(input):
    output = []
    for line in input.splitlines():
        # list() is needed in Python 3, where map returns an iterator
        output.append(list(map(int, line.split())))
    return np.array(output)
Here's what I have so far in Julia:
function parse(input)
nrow = ncol = 0
# Count first
for row=split(input,'\n')
nrow += 1
ncol = max(ncol,length(split(row)))
end
output = zeros(Int64,(nrow,ncol))
for (i,row) in enumerate(split(input,'\n'))
for (j,word) in enumerate(split(row))
output[i,j] = int(word)
end
end
return output
end
What's the Julia version of "pythonic" called? Whatever it is, I don't think I'm doing it. I'm pretty sure there's a way to (1) not have to pass through the data twice, (2) not have to be so specific about allocating the array. I've tried hcat/vcat a little, without luck.
I'd welcome suggestions for solving this. I'd also be interested in references to proper Julia style (julia-onic?), and general language usage practices. Thanks!
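readdlm is really useful here. See the docs for all the options, but here's an example. (On Julia 0.7 and later, readdlm lives in the DelimitedFiles standard library, so run using DelimitedFiles first.)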
julia> data="1 2 3 4
5 6 7 8
9 0 1 2"
"1 2 3 4\n5 6 7 8\n9 0 1 2"
julia> readdlm(IOBuffer(data))
3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
5.0 6.0 7.0 8.0
9.0 0.0 1.0 2.0
julia> readdlm(IOBuffer(data),Int)
3x4 Array{Int32,2}:
1 2 3 4
5 6 7 8
9 0 1 2
I'm using the doSNOW package to parallelize tasks that differ in length. When one thread is finished, I want:
1) some information generated by old threads to be passed to the next thread
2) the next thread to start immediately (load balancing like in clusterApplyLB)
It works single-threaded (see makeCluster(spec=1)).
#Register Snow and doSNOW
require(doSNOW)
#CHANGE spec to 4 or more, to see what my problem is
registerDoSNOW(cl <- makeCluster(spec=1,type="SOCK",outfile=""))
numbersProcessed <- c() # init processed vector
x <- foreach(i = 1:10,.export="numbersProcessed") %dopar% {
#Do working stuff
cat(format(Sys.time(), "%X"),": ","Starting",i,"(Numbers processed so far:",numbersProcessed, ")\n")
Sys.sleep(time=i)
#Appends this number to general vector
numbersProcessed <- append(numbersProcessed,i)
cat(format(Sys.time(), "%X"),": ","Ending",i,"\n")
cat("--------------------\n")
}
#End it all
stopCluster(cl)
Now change the spec in "makeCluster" to 4. Output is something like this:
[..]
Type: EXEC
18:12:21 : Starting 9 (Numbers processed so far: 1 5 )
18:12:23 : Ending 6
--------------------
Type: EXEC
18:12:23 : Starting 10 (Numbers processed so far: 2 6 )
18:12:25 : Ending 7
At 18:12:21, thread 9 knew that threads 1 and 5 had been processed. Two seconds later, thread 6 ends. The next thread should know at least about 1, 5 and 6, right? But thread 10 only knows about 6 and 2.
I realized this has something to do with the number of cores specified in makeCluster: 9 knows about 1, 5 and 9 (1 + 4 + 4), and 10 knows about 2, 6 and 10 (2 + 4 + 4).
Is there a better way to pass "processed" stuff to further generations of threads?
Bonus points: Is there a way to "print" to the master node in parallel processing without getting these "Type: EXEC" etc. messages from the snow package? :)
Thanks!
Marc
My bad. Damn.
I thought foreach with %dopar% was load-balanced. This isn't the case, which makes my question obsolete, because nothing can be executed on the host side during parallel processing. That explains why global variables are only manipulated on the client side and never reach the host.
When executing:
def guess(a..b) do
IO.puts "In rn = #{a}..#{b}"
guess(a..b, IO.getn("Is it greater than #{div(a + b, 2)} ? : ", 1) |> String.upcase == "Y")
end
def guess(a..b, true) do
guess(div(a + b, 2)..b)
end
def guess(a..b, false) do
guess(a..div(a + b, 2))
end
Results:
iex(1)> Test.guess(1..10)
1 In rn = 1..10
2 Is it greater than 5 ? : y
3 In rn = 5..10
4 Is it greater than 7 ? :
5 In rn = 5..7
6 Is it greater than 6 ? : n
7 In rn = 5..6
8 Is it greater than 5 ? :
9 In rn = 5..5
10 Is it greater than 5 ? : y
11 In rn = 5..5
12 Is it greater than 5 ? :
13 In rn = 5..5
14 Is it greater than 5 ? :
iex did not wait for user input on lines 4, 8, & 12 - after receiving an input, it appears to run through the loop twice.
Why might that be?
Solved:
Apparently, something odd happens with IO.getn when used in this manner: it seems to read "Y" as one byte and "enter" as a separate byte. Replacing it with IO.gets, which takes no character count, seems to fix the problem. Alternatively, isolating the getn call might keep this issue from occurring.
You are correct. When in the terminal, IO.getn/1 only returns the bytes after you enter a new line, which means that if you are reading byte by byte recursively, you are going to receive two bytes: one for the user command and another for the new line. IO.gets/1 is the way to go here.
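The effect can be imitated outside Elixir with a short Python sketch. It is an analogy only: an in-memory io.StringIO stream stands in for the terminal, and the helper names read_chars and read_lines are hypothetical. The typed input "y", Enter, "n", Enter arrives as four characters, so reading one character at a time hands back the newlines as separate answers.

```python
import io


def read_chars(stream, count):
    """Read count answers one character at a time (like repeated getn with 1)."""
    return [stream.read(1) for _ in range(count)]


def read_lines(stream, count):
    """Read count answers one line at a time (like gets), dropping the newline."""
    return [stream.readline().rstrip("\n") for _ in range(count)]


typed = "y\nn\n"  # what the user typed: y, Enter, n, Enter

print(read_chars(io.StringIO(typed), 4))  # ['y', '\n', 'n', '\n']
print(read_lines(io.StringIO(typed), 2))  # ['y', 'n']
```

Every second single-character read returns the newline, which is not "Y", and that matches the prompts that appeared to be skipped on lines 4, 8 and 12 of the output above.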