Search for duplicates in texts with math equations [closed] - math

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
My employer asked me to do a project for our local team. It is meant to help us finish our work faster.
We have a local database where we add exercises, each divided into two fields: the question and the solution. Since we are a team and work at the same time, my employer wants a system like Stack Overflow's similar questions. When a team member tries to submit a new entry to the database, the system should check whether any existing entries may be duplicates.
He asked me because I have done something similar in the past, but only for text, using techniques like TF-IDF and Latent Semantic Analysis. Now that the math symbols are all in LaTeX, I cannot find a way to check for duplicates.
I have tried to apply TF-IDF to the text only, but it doesn't work.
Any suggestions?
Edit:
Sorry for the broad topic. I will try to give more examples of my problem.
All the texts are exercises for primary and secondary schools. Each is a mix of text and numbers, equations, and symbols. If there were only text, I could use TF-IDF to find possible duplicates. However, several exercises have little or no text.
Examples:
1) a. Solve the following equation: (x+1)*(x-1) = 5
b. Find the x: x^2 - 1 = 5
They are the same equation but expressed differently, so I don't want to mark them as duplicates.
2) a. Solve the following equation: 3x + 7 = 12
b. Find the solution: 7 + 3x = 12
c. Find the x: 3x = 12 - 7
a and b should be marked as duplicates, whereas c should not.

You could try using MathJax to convert the LaTeX equation into MathML, an XML format. You could then use XML tools to examine that structure. There are probably a few other tools which can convert your equation into some kind of tree structure.
Equality of mathematical expressions is a complex problem. There is the question of whether you should treat (x+1)*(x-1) as equal to x^2-1; algebraically they are the same.
You might want to investigate computer algebra systems, which have a lot of sophisticated features for manipulating expressions.
One technique is to evaluate the expressions at a number of points. If the values agree, then it's a good indication that the expressions are the same.
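The point-evaluation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expressions are already available as callables (parsing LaTeX into callables is not shown), and `probably_equal`, the sample range, and the tolerances are hypothetical choices, not part of any library.

```python
import math
import random


def probably_equal(f, g, trials=20, low=-10.0, high=10.0, seed=0):
    """Return True if f and g agree at `trials` random points in [low, high].

    Agreement is only evidence of equality, not a proof; a single
    disagreement shows the expressions differ.
    """
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    for _ in range(trials):
        x = rng.uniform(low, high)
        if not math.isclose(f(x), g(x), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


# The two forms from the question agree everywhere they are sampled.
same = probably_equal(lambda x: (x + 1) * (x - 1), lambda x: x**2 - 1)
# A genuinely different expression is rejected.
different = probably_equal(lambda x: x**2 + 1, lambda x: x**2 - 1)
```

Note that this test treats (x+1)*(x-1) and x^2-1 as equal, so on its own it would flag example 1 as a duplicate. It would have to be combined with a structural comparison to match the behaviour asked for in the question.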
It might be easier to give a better answer if there were some idea of the type of problems you are working with: polynomials, integrals, etc.

Related

What would be a best practice to program mathematical calculations? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I'm programming some financial software that needs a few not very complicated mathematical formulas.
After writing the code, it is not readable anymore; that is, you can't easily discern what the original formula was.
What would be a good way to program mathematical formulas so they could be easily read later on?
For example, programming a calculation for a loan with a fixed interest rate:
(TotalValue*monthlyInterest*Math.pow((1+monthlyInterest),totalPayments))/(Math.pow((1+monthlyInterest),totalPayments) - 1)
Even with meaningful variable names, the formula is not readable. But if you look at this formula written in classical mathematical notation on a page, you will easily know what's going on (really basic math).
How would you take this formula and write it in a readable way?
Clarification
I'm not talking about any specific language. This should be the same for any high-level language.
The example uses Javascript.
You could also add local variables that are abbreviations of the long name. As they are local, their definition will be right above the formula. This could look like
TV = TotalValue;
mi = monthlyInterest;
N = totalPayments;
MR = ( TV*mi*Math.pow( 1+mi, N ) ) / ( Math.pow( 1+mi, N ) - 1);
MonthlyRate = MR;
where you then see that you can simplify the formula, by dividing numerator and denominator by Math.pow( 1+mi, N ), to
MR = (TV*mi) / ( 1 - Math.pow( 1+mi , -N) )
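As a quick check on the algebra, here is a minimal Python sketch of the fixed-rate payment in both forms. The function names and the sample loan are made up for illustration. Dividing the numerator and denominator of the long form by (1+i)^N gives the short form TV*i / (1 - (1+i)^-N), and the two should agree numerically.

```python
import math


def monthly_payment(total_value, monthly_interest, total_payments):
    """Long form: TV * i * (1+i)^N / ((1+i)^N - 1)."""
    growth = (1 + monthly_interest) ** total_payments
    return total_value * monthly_interest * growth / (growth - 1)


def monthly_payment_simplified(total_value, monthly_interest, total_payments):
    """Short form: TV * i / (1 - (1+i)^-N)."""
    discount = (1 + monthly_interest) ** -total_payments
    return total_value * monthly_interest / (1 - discount)


# Hypothetical loan: 100000 borrowed at 0.5% per month over 360 payments.
long_form = monthly_payment(100_000, 0.005, 360)
short_form = monthly_payment_simplified(100_000, 0.005, 360)
forms_agree = math.isclose(long_form, short_form, rel_tol=1e-12)
```

Naming the intermediate quantity (`growth`, `discount`) is another way to keep the code close to the written formula.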
As far as I know, it is not possible to convert Excel formulas to equations.
But for clarity, you could recreate the equation in the Equation Editor and put it next to the cell. This way it is immediately obvious what you are doing.
Another way would be to create the mathematical equation and add a macro. This way you could calculate your values in VBA.

How to create a mathematical function from data plots [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
I am by no means a math person, but I am really trying to figure out how to create a graphable function from some data points I measured in a chemical titration. I have been trying to learn R, and I would like to know if anyone can explain to me, or point me to a guide on, how to create a mathematical function of the titration graph below.
Thanks in advance.
What you are looking for is interpolation. I'm not an R programmer, but I'll try to answer anyway.
One of the more common ways to obtain the function you want is polynomial interpolation, which gives back a polynomial of degree at most N, where N is the number of data points minus one (1 point gives a constant, 2 points give a line, 3 points give a*x^2 + b*x + c, and so on).
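To make the idea concrete, here is a minimal sketch of polynomial interpolation in Lagrange form, written in plain Python rather than R. The function name and the sample points are hypothetical; the points are not titration measurements, they are simply taken from y = 2*x^2 + 1 so the result is easy to check.

```python
def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial of degree <= len(points) - 1
    that passes through every (xi, yi) in points (the xi must be distinct)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Basis factor: equals 1 at x = xi and 0 at x = xj.
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Three points determine a quadratic; these lie on y = 2*x^2 + 1.
sample_points = [(0.0, 1.0), (1.0, 3.0), (2.0, 9.0)]
value_at_3 = lagrange_interpolate(sample_points, 3.0)
```

With many measured points a single high-degree polynomial tends to oscillate between them, which is one reason piecewise methods such as splines are often preferred.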
Other common alternatives, which I've learned are used in computer graphics, are splines, B-splines, Bézier curves, and Hermite interpolation. These make the curve smoother and better looking (I've been told they originated in the car industry, so they are less true to the data points).
TL;DR: I've found evidence that there is an implementation of splines in R, from the question Interpolation in R, which may lead you to your solution.
Hope you get to know your tool better and do great work.
In Computer Science we call this kind of work Numerical Methods (at least here at my university). I did some classwork and homework in this area while attending the Numerical Methods course (it can be found on GitHub), but it's nothing worth noting.
I would add a lot of links to Wikipedia but StackOverflow didn't allow it.

Automate searching of pages such as Wikipedia [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
My question is rather general and not specific to Wikipedia only. I would like to know whether there is a way to automate the generation and selection of search results. To give an example of what I intend:
Let's say I'd like to write articles about American food, and I'd like to read information such as ingredients, texture, cuisine (County-wise), preparation methods, etc. about approximately 500 different American foods. Let's say these are all available on Wikipedia too, and I have an Excel sheet with the names of these dishes and columns specifying their properties. I don't want to look up these dishes and food items manually; can I automate this process? I am looking for some general guidance, some open-source links, some pseudo-code, or an algorithmic approach to this problem. Any help is appreciated.
Thanks.
P.S.: It'd be great if the answer had some links to help in carrying it out using R, since the other aspects of my project have already been built in R. I'd also like to broaden my searches to include other major information-gathering sites and search engines.
You can do it relatively quickly with the WikipediR package:
library(WikipediR)
phrs <- c("car", "house")
# The result list must exist before the loop assigns into it.
pgs <- list()
for (i in phrs) {
  # Store each page's content under its page name.
  pgs[[i]] <- page_content("en", "wikipedia", page_name = i, as_wikitext = TRUE)
}
The solution rather optimistically assumes that your food names correspond to the page names on Wikipedia. Most probably this won't be the case for all the items. You may consider using pages_in_category in order to source more pages at once. I would first match my list against pages_in_category for a given category (foods) and, if the number of mismatches is insignificant, proceed to matching the data.
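The matching step can be sketched independently of how the page titles are obtained. The example below is in Python rather than R and uses the standard-library `difflib` module for approximate string matching. The function name `match_titles`, the 0.8 cutoff, and the sample names and titles are all hypothetical; the titles stand in for whatever a category listing would return.

```python
import difflib


def match_titles(names, titles, cutoff=0.8):
    """Match each name to its closest title, ignoring case.

    Returns (matched, unmatched): a dict name -> title, and the list of
    names with no title at or above the similarity cutoff.
    """
    by_lower = {title.lower(): title for title in titles}
    candidates = list(by_lower)
    matched, unmatched = {}, []
    for name in names:
        close = difflib.get_close_matches(name.lower(), candidates, n=1, cutoff=cutoff)
        if close:
            matched[name] = by_lower[close[0]]
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical spreadsheet names and category page titles.
dish_names = ["Clam chowder", "cornbread", "Gumbo soup"]
page_titles = ["Clam chowder", "Cornbread", "Gumbo", "Jambalaya"]
matched, unmatched = match_titles(dish_names, page_titles)
```

The `unmatched` list is what would need a manual look or a looser cutoff before fetching page content.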

Examples to compare traditional math notation vs APL/J notation [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
I am reading a review comparing Mathematica to APL/J. One question raised in the article seems very interesting to me:
Is Mathematica really the way to go to express our creative thoughts –
viz back to a 17th century notation designed for parchment instead of
forward to a twentieth-century one designed for computers?
Can someone share examples of Iverson's notation vs traditional math notation to demonstrate the edge of APL/J in expressing and solving math problems? This would be greatly helpful for newcomers.
One example: Alternating series.
Alternating sums are very common in mathematics, but it is cumbersome to put the sign before each term, as in a1 - a2 + a3 - a4 + ...
In APL and J, because of the order of operations, it is
-/a
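For readers who do not know APL, the effect of -/a can be imitated in Python. APL's reduction groups from the right, so -/a means a1 - (a2 - (a3 - ...)), which expands to the alternating sum. The helper name `reduce_right` below is made up for this sketch.

```python
from functools import reduce
import operator


def reduce_right(op, values):
    """Fold `values` from the right: op(v1, op(v2, op(v3, ...))).

    Raises TypeError on an empty sequence, like functools.reduce
    without an initial value.
    """
    return reduce(lambda acc, x: op(x, acc), reversed(values))


a = [1, 2, 3, 4]
# 1 - (2 - (3 - 4)), which is the same as 1 - 2 + 3 - 4
alternating = reduce_right(operator.sub, a)
```

A left fold, which is what `functools.reduce` does by default, would instead compute ((1 - 2) - 3) - 4.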
I recommend reading Iverson's paper Notation as a Tool of Thought, kindly provided by the J folks. It deals precisely with this issue.
In it you'll find many Math proofs derived using APL instead of the classical notation, along with accompanying commentary. Here's a redacted example, proving Gauss's formula for the arithmetic series:
+/⍳n
+/⌽⍳n ⍝ as + is associative and commutative
((+/⍳n)+(+/⌽⍳n))÷2 ⍝ as x=(x+x)÷2
(+/(⍳n)+(⌽⍳n))÷2 ⍝ as + is associative and commutative
(+/(n/n+1))÷2 ⍝ summing each respective x∊⍳n and y∊⌽⍳n, y=n+1-x → (x+y)=n+1
(n×n+1)÷2 ⍝ per definition of × (times)
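The steps of the derivation can be checked numerically in Python. This is only a sanity check, not a proof, and it assumes ⍳n yields 1..n (index origin 1). The function name `gauss_steps` is hypothetical.

```python
def gauss_steps(n):
    """Return the direct sum, the pairing step, and the closed form for 1..n."""
    forward = list(range(1, n + 1))   # ⍳n
    backward = forward[::-1]          # ⌽⍳n
    # Each forward/backward pair adds up to n + 1.
    pair_sums = [x + y for x, y in zip(forward, backward)]
    direct = sum(forward)             # +/⍳n
    paired = sum(pair_sums) // 2      # (+/(⍳n)+(⌽⍳n))÷2
    closed_form = n * (n + 1) // 2    # (n×n+1)÷2
    return direct, paired, closed_form


direct, paired, closed_form = gauss_steps(100)
```

All three values coincide for any positive n, which mirrors the chain of equalities in the APL lines.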
Other articles by Iverson, Hui and friends are also illuminating. Again, the J folks provide a notable library.

Which program to solve integration = 0 for a variable? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I'd like to know how to solve a definite integral in Mathematica.
I know all variables except b, and need to solve for F(b)=0.
How can I solve it in Mathematica?
Here is my try:
NSolve[Integrate[1/(8*(1 - ff) (2 Pi)^0.5) E^(-0.5*((x - 1.1)/(1 - ff)/8)^2), {x, 0, 9999}] == -0.44531779637243296, ff]
These integrals can be trivially expressed in terms of an error function: Wiki, Mathworld. Hence what you need here is a library to (i) calculate error functions, (ii) numerically solve non-linear equations. Virtually any language has this, so pick anything you're familiar with. In Mathematica, look up Erf and NSolve.
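The two ingredients named above, an error function and a nonlinear solver, can be sketched with the Python standard library alone. Everything specific here is an assumption for illustration: the interval [0, 10], the mean 1.1, the target value 0.5, and the choice of plain bisection as the solver. These are not the question's actual numbers or a recommended method.

```python
import math


def gaussian_integral(a, b, mu, sigma):
    """Integral over [a, b] of the normal density with mean mu and std sigma > 0."""
    scale = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((b - mu) / scale) - math.erf((a - mu) / scale))


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Hypothetical problem: which sigma makes the integral over [0, 10] equal 0.5?
target = 0.5
sigma = bisect(lambda s: gaussian_integral(0.0, 10.0, 1.1, s) - target, 0.1, 100.0)
```

Because the integral has a closed form in terms of erf, no numerical quadrature is needed; only the one-dimensional root search is numerical.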
I'd start by plugging it into Wolfram Alpha and see what it gives you.
Mathematica should be able to do it. I think of statistics first when R comes up; I don't know about its calculus capabilities. Excel is not the first choice.
If I were you, I'd be less worried about the software and more worried about the solution itself. A function of this form might be well known. Plot each one and visually check to see what the functions look like and how easy they might be to integrate.
Like this:
http://www.wolframalpha.com/input/?i=graph+exp%28-%28%28x%2B5%29%2F1.5%29%5E2%29
You should be wondering why it's three similar looking integrals. Those singularities in the plot tell you why.
If there's no closed-form solution, you'll have to go with a numerical one. You'll have to choose an algorithm (simple Euler, Runge-Kutta, or something else), interval sizes, etc. You'll want to know about singular points and how best to tackle them.
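As a rough sketch of what the numerical route involves, a definite integral can be treated as the ODE y' = f(x) with y(a) = 0 and stepped with Euler or classical Runge-Kutta. For an integrand that does not depend on y, the RK4 step reduces to Simpson's rule. The integrand is the one from the Wolfram Alpha link above; the interval and step count are arbitrary choices for illustration.

```python
import math


def integrate_euler(f, a, b, n):
    """Explicit Euler on y' = f(x): a left Riemann sum with n steps."""
    h = (b - a) / n
    return sum(h * f(a + i * h) for i in range(n))


def integrate_rk4(f, a, b, n):
    """Classical RK4 on y' = f(x); equivalent to Simpson's rule on each step."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * h
        k1 = f(x)
        k2 = f(x + h / 2)  # k3 equals k2 because f does not depend on y
        k4 = f(x + h)
        total += h * (k1 + 4 * k2 + k4) / 6
    return total


def bell(x):
    return math.exp(-(((x + 5) / 1.5) ** 2))


# Over the whole real line this integral is 1.5 * sqrt(pi); the tails
# outside [-20, 10] are negligible.
exact = 1.5 * math.sqrt(math.pi)
approx_rk4 = integrate_rk4(bell, -20.0, 10.0, 3000)
approx_euler = integrate_euler(bell, -20.0, 10.0, 3000)
```

For integrands with singular points, a fixed step size like this is usually not adequate and the interval has to be split or the step adapted.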
Choosing a package is just the start.
You might find http://r.789695.n4.nabble.com/calculus-using-R-td1676727.html helpful.

Resources