I am trying to implement an internal search for my website that can point users in the right direction in case they mistype a word, something like the "Did you mean:" feature in Google search.
Does anybody have an idea how such a search can be done? How can we establish the relevance of the word or the phrase we assume the user intended to search for?
I use ASP.NET and SQL Server 2005 with FTS (full-text search).
Thank you
You could use an algorithm for determining string similarity and then suggest other strings from your search index, up to a certain difference.
One of these algorithms is the Levenshtein distance.
However, don't forget searching for existing solutions. I think e.g. Lucene has the capability to search for similar strings.
Btw, here's a related post on this topic: How does the Google “Did you mean?” Algorithm work?
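To make the string-similarity idea concrete, here is a minimal sketch of the Levenshtein distance together with a lookup that picks the closest word from an index. The function names, the in-memory word list, and the maxDistance cutoff are assumptions made for illustration; a real site would pull candidates from its search index instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance: the minimum number of single-character insertions,
// deletions, and substitutions needed to turn `a` into `b`.
int levenshtein(const std::string& a, const std::string& b) {
    // Only two rows of the dynamic-programming table are kept at a time.
    std::vector<int> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // delete a[i-1]
                                curr[j - 1] + 1,      // insert b[j-1]
                                prev[j - 1] + cost}); // substitute or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical helper: returns the indexed word closest to `query`, or an
// empty string if no word is within `maxDistance` edits.
std::string didYouMean(const std::string& query,
                       const std::vector<std::string>& index,
                       int maxDistance) {
    std::string best;
    int bestDistance = maxDistance + 1;
    for (const std::string& word : index) {
        int d = levenshtein(query, word);
        if (d < bestDistance) {
            bestDistance = d;
            best = word;
        }
    }
    return best;
}
```

Comparing against every indexed word is linear in the size of the index, so for a large vocabulary you would probably narrow the candidates first (for example by first letter or length).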
This is done by querying, with a regular expression, for the closest keywords that match the phrase.
Here is a great article that might help you.
With T-SQL you can use the SOUNDEX function to compare words phonetically.
If you take the user's input and compare it with other words in your database by SOUNDEX code, you should be able to come up with a list of "did you mean?" words.
E.g.
select SOUNDEX('andrew')
select SOUNDEX('androo')
will both produce the same output (A536).
There are better algorithms these days, but soundex is built into sql server.
The simplest approach I can think of is to write a function that returns the degree of mismatch between two words, and you loop through all the words and find the best ones.
I've done this with a branch-and-bound method. Let me dig up the code:
bool matchWithinBounds(char* a, char* b, int bound){
    // skip over matching characters
    while(*a && *b && *a == *b){a++; b++;}
    if (*a==0 && *b==0) return true;
    // if bound too low, quit
    if (bound <= 0) return false;
    // try assuming a has an extra character
    if (*a && matchWithinBounds(a+1, b, bound-1)) return true;
    // try assuming a had a letter deleted
    if (*b && matchWithinBounds(a, b+1, bound-1)) return true;
    // try assuming a had a letter replaced
    if (*a && *b && matchWithinBounds(a+1, b+1, bound-1)) return true;
    // try assuming a had two adjacent letters swapped
    // (swaps in place and swaps back, so a must point to writable memory)
    if (a[0] && a[1]){
        char temp;
        int success;
        temp = a[0]; a[0] = a[1]; a[1] = temp;
        success = matchWithinBounds(a, b, bound-1);
        temp = a[0]; a[0] = a[1]; a[1] = temp;
        if (success) return true;
    }
    // can try other modifications
    return false;
}
int DistanceBetweenWords(char* a, char* b){
    int bound = 0;
    // try increasing bounds until the words match
    for (bound = 0; bound < 10; bound++){
        if (matchWithinBounds(a, b, bound)) return bound;
    }
    // no match within the largest bound tried
    return 1000;
}
Why don't you use Google's power? You can consume their suggest service.
Here is an example in C#.
Related
I'm trying to find the number of nodes in a BST using recursion. Here is my code
struct Node{
int key;
struct Node* left;
struct Node* right;
Node(){
int key = 0;
struct Node* left = nullptr;
struct Node* right = nullptr;
}
};
src_root is the address of the root node of the tree.
int BST::countNodes(Node* src_root, int sum){
if((src_root==root && src_root==nullptr) || src_root==nullptr)
return 0;
else if(src_root->left==nullptr || src_root->right==nullptr)
return sum;
return countNodes(src_root->left, sum + 1) + countNodes(src_root->right, sum + 1) + 1;
}
However, my code only seems to work if there are 3 nodes. Anything greater than 3 gives the wrong answer. Please help me find out what's wrong with it. Thanks!
It has been a long time since I wrote anything in C/C++, so there might be some syntax errors.
int BST::countNodes(Node *src_root)
{
    // an empty subtree contributes no nodes
    if (src_root == nullptr) return 0;
    // this node plus everything below it
    return 1 + countNodes(src_root->left) + countNodes(src_root->right);
}
I think that will do the job.
You have several logical and structural problems in your implementation. Casperah gave you the "clean" answer that I assume you already found on the web (if you haven't already done that research, you shouldn't have posted your question). Thus, what you're looking for is not someone else's solution, but how to fix your own.
Why do you pass sum down the tree? Lower nodes shouldn't care what the previous count is; it's the parent's job to accumulate the counts from its children. See how that's done in Casperah's answer? Drop the extra parameter from your code; it's merely another source for error.
Your base case has an identically false clause: src_root==root && src_root==nullptr ... if you make a meaningful call, src_root cannot be both root and nullptr.
Why are you comparing against a global value, root? Each call simply gets its own job done and returns. When your call tree crawls back to the original invocation, the one that was called with the root, it simply does its job and returns to the calling program. This should not be a special case.
Your else clause is wrong: it says that if either child is null, you ignore counting the other child altogether and return only the count so far. This guarantees that you'll give the wrong answer unless the tree is absolutely balanced and filled, a total of 2^N - 1 nodes for N levels.
Fix those items in whatever order you find instructive; the idea is to learn. Note, however, that your final code should look a lot like the answer Casperah provided.
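For checking your fixed version against a tree that is not perfectly balanced, here is a small self-contained sketch. It uses a free function rather than a BST member, and the Node constructor with a member initializer list is my own addition: the constructor in the question declares three new local variables instead of initializing the members, so key, left and right are left uninitialized there.

```cpp
#include <cstddef>

struct Node {
    int key;
    Node* left;
    Node* right;

    // The member initializer list sets the members themselves
    // (no local variables are declared here).
    explicit Node(int k = 0, Node* l = nullptr, Node* r = nullptr)
        : key(k), left(l), right(r) {}
};

// Counts the nodes in the subtree rooted at `node`.
// No running sum is passed down: each call reports only its own subtree.
int countNodes(const Node* node) {
    if (node == nullptr) return 0;  // empty subtree
    return 1 + countNodes(node->left) + countNodes(node->right);
}
```

A tree where some node has only one child is the interesting test, since that is exactly the case the else clause in the question gets wrong.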
I wanted to have a linked list of nodes with below structure.
struct node
{
string word;
string color;
node *next;
};
For some reason I decided to use a vector instead of a list. My question is: is it possible to implement a vector that is bounded in the j direction and unlimited in the i direction, and to add two more strings at the end of my vector?
In other words, is it possible to implement the structure below with a vector?
j
i color1 color2 …
word1 word2 …
I am not good with C/C++, so this answer will only be very general. Unless you are extremely concerned about speed or memory optimization (most of the time you shouldn't be), use encapsulation.
Make a class. Make an interface which says what you want to do. Make the simplest possible implementation of how to do it. Most of the time, the simplest implementation is good enough, unless it contains some bugs.
Let's start with the interface. You could have made it part of the question. To me it seems that you want a two-dimensional something-like-an-array of strings, where one dimension allows only values 0 and 1, and the other dimension allows any non-negative integers.
Just to make sure there is no misunderstanding: The bounded dimension is always size 2 (not at most 2), right? So we are basically speaking about 2×N "rectangles" of strings.
What methods will you need? My guesses: A constructor for a new 2×0 size rectangle. A method to append a new pair of values, which increases the size of the rectangle from 2×N to 2×(N+1) and sets the two new values. A method which returns the current length of the rectangle (only the unbounded dimension, because the other one is constant). And a pair of random-access methods for reading or writing a single value by its coordinates. Is that all?
Let's write the interface (sorry, I am not good at C/C++, so this will be some C/Java/pseudocode hybrid).
class StringPairs {
constructor StringPairs(); // creates an empty rectangle
int size(); // returns the length of the unbounded dimension
void append(string s0, string s1); // adds two strings to the new J index
string get(int i, int j); // return the string at given coordinates
void set(int i, int j, string s); // sets the string at given coordinates
}
We should specify what the functions "set" and "get" do if an index is out of bounds. For simplicity, let's say that "set" will do nothing, and "get" will return null.
Now we have the question ready. Let's get to the answer.
I think the fastest way to write this class would be to simply use the existing C++ class for one-dimensional vector (I don't know what it is and how it is used, so I just assume that it exists, and will use some pseudocode; I will call it "StringVector") and do something like this:
class StringPairs {
    private StringVector _vector0;
    private StringVector _vector1;
    private int _size;
    constructor StringPairs() {
        _vector0 = new StringVector();
        _vector1 = new StringVector();
        _size = 0;
    }
    int size() {
        return _size;
    }
    void append(string s0, string s1) {
        _vector0.append(s0);
        _vector1.append(s1);
        _size++;
    }
    string get(int i, int j) {
        if (0 == i) return _vector0.get(j);
        if (1 == i) return _vector1.get(j);
        return null;
    }
    void set(int i, int j, string s) {
        if (0 == i) _vector0.set(j, s);
        if (1 == i) _vector1.set(j, s);
    }
}
Now, translate this pseudocode to C++, and add any new methods you need (it should be obvious how).
Using the existing classes to build your new classes can help you program faster. And if you later change your mind, you can change the implementation while keeping the interface.
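One possible C++ translation of the pseudocode, as a sketch rather than the definitive version: it assumes std::vector<std::string> plays the role of "StringVector", drops the separate size counter because the vectors already know their length, and returns a null pointer from get where the pseudocode returns null.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A 2 x N "rectangle" of strings: i is the bounded index (0 or 1),
// j is the unbounded index (0 .. size() - 1).
class StringPairs {
public:
    // Length of the unbounded dimension.
    std::size_t size() const { return rows_[0].size(); }

    // Adds one new column holding s0 (row 0) and s1 (row 1).
    void append(const std::string& s0, const std::string& s1) {
        rows_[0].push_back(s0);
        rows_[1].push_back(s1);
    }

    // Returns a pointer to the string at (i, j), or nullptr if either
    // index is out of bounds. The pointer may become invalid after a
    // later append(), so copy the string if you need to keep it.
    const std::string* get(std::size_t i, std::size_t j) const {
        if (i >= 2 || j >= size()) return nullptr;
        return &rows_[i][j];
    }

    // Overwrites the string at (i, j); does nothing if out of bounds.
    void set(std::size_t i, std::size_t j, const std::string& s) {
        if (i >= 2 || j >= size()) return;
        rows_[i][j] = s;
    }

private:
    std::vector<std::string> rows_[2];  // both rows always have equal length
};
```

With the layout from the question, row 0 could hold the colors and row 1 the words, so append("red", "apple") adds one color/word pair.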
The function gets an integer and a digit, and should return true
if the digit appears an even number of times in the integer, or false if not.
For example:
If digit=1 and num=1125
the function should return true.
If digit=1 and num=1234
the function should return false.
bool isEven(int num, int dig)
{
bool even;
if (num < 10)
even = false;
else
{
even = isEven(num/10,dig);
This is what I've got so far, and I'm stuck...
This is homework so please don't write the answer but hint me and help me get to it by myself.
To set up recursion, you need to figure out two things:
The base case. What are the easy cases that you can handle outright? For example, can you handle single-digit numbers easily?
The rule(s) that reduce all other cases towards the base case. For example, can you chop off the last digit and somehow transform the solution for the remaining partial number into the solution for the full number?
I can see from your code that you've made some progress on both of these points. However, both are incomplete. For one thing, you are never using the target digit in your code.
The expression num%10 will give you the last digit of a number, which should help.
Your base case is incorrect because a single digit can have an even number of matches (zero is an even number). Your recursive case also needs work because you need to invert the answer for each match.
This function isEven() takes a single integer and returns true if the number of occurrences of numberToCheck is even.
You can change the base as well as the numberToCheck which are defined globally.
#include <iostream>
using std::cout;
using std::endl;
// using 10 due to decimal [change it to respective base]
const int base = 10;
const int numberToCheck = 5;
// Checks if the number of occurrences of "numberToCheck" is even or odd
bool isEven(int n)
{
    // no digits left: zero occurrences, which is even
    if (n == 0)
        return true;
    bool hasNumber = false;
    int currentDigit = n % base;
    n /= base;
    if (currentDigit == numberToCheck)
        hasNumber = true;
    bool flag = isEven(n);
    // XOR GATE
    return ((!hasNumber) && (flag) || (hasNumber) && (!flag));
}
int main(void)
{
    // This is the input to the function isEven()
    int n = 51515;
    if (isEven(n))
        cout << "Even";
    else
        cout << "Odd";
    return 0;
}
Using XOR Logic to integrate all returns
// XOR GATE
return ((!hasNumber) && (flag) || (hasNumber) && (!flag));
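For comparison, here is a sketch that keeps the two-parameter signature from the question instead of globals. It assumes num is non-negative and treats a single-digit number as the base case, so 0 counts as the one-digit number "0"; those choices are mine, not part of the original assignment.

```cpp
// Returns true if `dig` appears an even number of times in `num`.
// Assumes num >= 0 and 0 <= dig <= 9.
bool isEven(int num, int dig)
{
    // Base case: a single digit matches either zero times (even)
    // or exactly once (odd).
    if (num < 10)
        return num != dig;

    // Answer for everything except the last digit.
    bool restIsEven = isEven(num / 10, dig);

    // A match in the last digit flips the parity; otherwise it is unchanged.
    return (num % 10 == dig) ? !restIsEven : restIsEven;
}
```

Flipping the answer on each match is the same thing the XOR expression above does.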
I am a newbie in Qt (4.7.4) and I am searching for a function that checks a QString for alphabetic characters and returns "true" if the QString contains only letters.
Should I write this simple function myself? :( I was hoping there is a function like isText() in VBA, but I have not found one on Google or in the documentation.
Thanks for your answers and sorry for my English :)
You can simply validate the string with the QRegExp class, matching an alphanumeric string. I suggest using it with QValidator to make it clearer.
You could use something like this (if your goal is to accept only strings that consist of a single repeated character):
bool containsOnly(QString str, QChar c)
{
for(int i=0; i<str.length(); i++)
if(str.at(i)!=c)
return false;
return true;
}
and in use:
bool b = containsOnly("String", 'a');
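If the goal is instead "letters only", the same loop works with a per-character test. The sketch below uses std::string and std::isalpha so that it compiles without Qt; the function name is made up, and treating an empty string as "not only letters" is my assumption. In Qt the equivalent per-character check would be QChar::isLetter().

```cpp
#include <cctype>
#include <string>

// Returns true if `str` is non-empty and every character is a letter
// according to std::isalpha in the current C locale.
bool containsOnlyLetters(const std::string& str)
{
    if (str.empty())
        return false;
    for (unsigned char ch : str) {  // unsigned char: required by std::isalpha
        if (!std::isalpha(ch))
            return false;
    }
    return true;
}
```

Note that std::isalpha only knows about single-byte characters, whereas QChar::isLetter() works on Unicode code units, so the Qt version would also accept accented letters.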
Alright, so I'm trying to make a Java program to solve a picross board, but I keep getting a StackOverflowError. I'm currently just teaching myself a little Java, so I like to use the things I know rather than finding a solution online, although my way is obviously not as efficient. The only way I could think of to solve this was a kind of brute force, trying every possibility. The thing is, I know that this function works because it works for smaller boards; the only problem is that with larger boards I tend to get errors before the function finishes.
so char[][] a is just the game board with all the X's and O's. int[][] b is an array with the numbers assigned for the picross board like the numbers on the top and to the left of the game. isDone() just checks if the board matches up with the given numbers, and shift() shifts one column down. I didn't want to paste my entire program, so if you need more information, let me know. Thanks!
I added the code for shift since someone asked. Shift just moves all the chars in one row up one cell.
Update: I'm thinking that maybe my code isn't spinning through every combination, and so it skips over the correct answer. Can anyone verify whether this is actually trying every possible combination? Because that would explain why I'm getting stack overflow errors. On the other hand, how many iterations can this go through before it's too much?
public static void shifter(char[][] a, int[][] b, int[] clockwork)
{
boolean correct = true;
correct = isDone(a, b);
if(correct)
return;
clockwork[a[0].length - 1]++;
for(int x = a[0].length - 1; x > 0; x--)
{
if(clockwork[x] > a.length)
{
shift(a, x - 1);
clockwork[x - 1]++;
clockwork[x] = 1;
}
correct = isDone(a, b);
if(correct)
return;
}
shift(a, a[0].length - 1);
correct = isDone(a, b);
if(correct)
return;
shifter(a, b, clockwork);
return;
}
public static char[][] shift(char[][] a, int y)
{
char temp = a[0][y];
for(int shifter = 0; shifter < a.length - 1; shifter++)
{
a[shifter][y] = a[shifter + 1][y];
}
a[a.length - 1][y] = temp;
return a;
}
Check the recursive call and give it a termination condition.
if(terminate condition)
{
exit();
}
else
{
call shifter()
}
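A likely reason for the stack overflow on larger boards: shifter calls itself once per combination tried, and Java does not reuse the stack frame for such a call, so the recursion depth grows with the number of combinations (roughly rows to the power of columns). A loop avoids that and also gives a natural stopping point once every combination has been visited. The sketch below is in C++ and only shows the counting part; nextCombination, searchAll and the isSolved callback are hypothetical names standing in for the clockwork array and isDone() from the question, and the same loop carries over to Java almost unchanged.

```cpp
#include <cstddef>
#include <vector>

// Advances `counters` like an odometer: the last position moves fastest and
// every position runs from 0 to base - 1. Returns false once all positions
// have wrapped back to zero, i.e. every combination has been produced.
bool nextCombination(std::vector<int>& counters, int base)
{
    for (std::size_t pos = counters.size(); pos-- > 0;) {
        if (++counters[pos] < base)
            return true;    // no carry needed, done
        counters[pos] = 0;  // wrap and carry into the position to the left
    }
    return false;
}

// Tries every combination with a plain loop, so the stack depth stays
// constant no matter how many combinations there are.
template <typename Predicate>
bool searchAll(std::vector<int>& counters, int base, Predicate isSolved)
{
    do {
        if (isSolved(counters))
            return true;  // counters now hold the solving combination
    } while (nextCombination(counters, base));
    return false;         // termination condition: everything was tried
}
```

Even without the stack problem, the number of combinations grows very quickly with board size, so brute force will still be slow on large boards.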