I have a string that contains some text followed by a blank line. What's the best way to keep the part with text, but remove the whitespace newline from the end?
Use String.trim() method to get rid of whitespaces (spaces, new lines etc.) from the beginning and end of the string.
String trimmedString = myString.trim();
myString = myString.replaceAll("[\n\r]", ""); // note: removes every newline, not only those at the ends
This Java code does what the title of the question asks, that is, "remove newlines from beginning and end of a string-java":
myString.replaceAll("^[\n\r]+", "").replaceAll("[\n\r]+$", "")
Remove newlines only from the end of the string:
myString.replaceAll("[\n\r]+$", "")
Remove newlines only from the beginning of the string:
myString.replaceAll("^[\n\r]+", "")
tl;dr
String cleanString = dirtyString.strip() ; // Call new `String::strip` method.
String::strip…
The old String::trim method has a strange definition of whitespace.
As discussed here, Java 11 adds new strip… methods to the String class. These use a more Unicode-savvy definition of whitespace. See the rules of this definition in the class JavaDoc for Character::isWhitespace.
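To make the difference concrete, here is a minimal sketch comparing the two methods on a string padded with U+2003 EM SPACE. The class and method names are made up for illustration, and `strip()` requires Java 11 or later.

```java
class TrimVersusStrip {
    // U+2003 EM SPACE: Unicode whitespace whose code point is above U+0020.
    static final String EM_SPACE = "\u2003";

    // trim() only removes characters with code points up to U+0020.
    static String viaTrim(String s) {
        return s.trim();
    }

    // strip() removes anything Character.isWhitespace accepts (Java 11+).
    static String viaStrip(String s) {
        return s.strip();
    }

    public static void main(String[] args) {
        String input = EM_SPACE + "some Thing" + EM_SPACE;
        System.out.println("trim  length: " + viaTrim(input).length());
        System.out.println("strip length: " + viaStrip(input).length());
    }
}
```

With this input, `trim()` leaves the em spaces in place while `strip()` removes them.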
Example code.
String input = " some Thing ";
System.out.println("before->>"+input+"<<-");
input = input.strip();
System.out.println("after->>"+input+"<<-");
Or you can strip just the leading or just the trailing whitespace.
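A short sketch of the one-sided variants, `stripLeading()` and `stripTrailing()` (both Java 11+). The wrapper class and method names are only illustrative.

```java
class StripOneSide {
    // Removes whitespace (including newlines) from the start only.
    static String leadingOnly(String s) {
        return s.stripLeading();
    }

    // Removes whitespace (including newlines) from the end only.
    static String trailingOnly(String s) {
        return s.stripTrailing();
    }

    public static void main(String[] args) {
        String input = "\n some Thing \n";
        System.out.println("leading ->>" + leadingOnly(input) + "<<-");
        System.out.println("trailing->>" + trailingOnly(input) + "<<-");
    }
}
```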
You do not mention exactly what code point(s) make up your newlines. I imagine your newline is likely included in this list of code points targeted by strip:
It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
It is '\t', U+0009 HORIZONTAL TABULATION.
It is '\n', U+000A LINE FEED.
It is '\u000B', U+000B VERTICAL TABULATION.
It is '\f', U+000C FORM FEED.
It is '\r', U+000D CARRIAGE RETURN.
It is '\u001C', U+001C FILE SEPARATOR.
It is '\u001D', U+001D GROUP SEPARATOR.
It is '\u001E', U+001E RECORD SEPARATOR.
It is '\u001F', U+001F UNIT SEPARATOR.
If your string is potentially null, consider using StringUtils.trim() - the null-safe version of String.trim().
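If pulling in Apache Commons Lang is not an option, the same null-safe behaviour can be approximated in plain Java. This is a sketch only: `trimOrNull` is a hypothetical helper name, not part of any library.

```java
class NullSafeTrim {
    // Hypothetical helper: returns null for null input, otherwise the trimmed string.
    static String trimOrNull(String s) {
        return s == null ? null : s.trim();
    }

    public static void main(String[] args) {
        System.out.println(">" + trimOrNull("some text\n") + "<");
        System.out.println(trimOrNull(null));
    }
}
```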
If you only want to remove line breaks (not spaces or tabs) at the beginning and end of a String (not in between), then you can use this approach:
Use a regular expression to remove carriage returns (\\r) and line feeds (\\n) from the beginning (^) and end ($) of a string:
s = s.replaceAll("(^[\\r\\n]+|[\\r\\n]+$)", "")
Complete Example:
public class RemoveLineBreaks {
public static void main(String[] args) {
var s = "\nHello world\nHello everyone\n";
System.out.println("before: >"+s+"<");
s = s.replaceAll("(^[\\r\\n]+|[\\r\\n]+$)", "");
System.out.println("after: >"+s+"<");
}
}
It outputs:
before: >
Hello world
Hello everyone
<
after: >Hello world
Hello everyone<
I'm going to add an answer to this as well because, while I had the same question, the provided answer did not suffice. Given some thought, I realized that this can be done very easily with a regular expression.
To remove newlines from the beginning:
// Trim left
String[] a = "\n\nfrom the beginning\n\n".split("^\\n+", 2);
System.out.println("-" + (a.length > 1 ? a[1] : a[0]) + "-");
and end of a string:
// Trim right
String z = "\n\nfrom the end\n\n";
System.out.println("-" + z.split("\\n+$", 2)[0] + "-");
I'm certain that this is not the most performance efficient way of trimming a string. But it does appear to be the cleanest and simplest way to inline such an operation.
Note that the same method can be done to trim any variation and combination of characters from either end as it's a simple regex.
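For comparison, trimming an arbitrary set of characters from both ends can also be done without a regex. The sketch below swaps in a plain index loop; `trimChars` and its parameters are hypothetical names chosen for illustration.

```java
class TrimChars {
    // Removes any character contained in `chars` from both ends of `s`.
    static String trimChars(String s, String chars) {
        int start = 0;
        int end = s.length();
        // Advance past unwanted characters at the beginning.
        while (start < end && chars.indexOf(s.charAt(start)) >= 0) {
            start++;
        }
        // Step back over unwanted characters at the end.
        while (end > start && chars.indexOf(s.charAt(end - 1)) >= 0) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("-" + trimChars("\n\nfrom both ends\r\n", "\r\n") + "-");
    }
}
```

Characters in the middle of the string are left untouched, so this only affects the two ends.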
Try this
function replaceNewLine(str) {
return str.replace(/[\n\r]/g, "");
}
String trimStartEnd = "\n TestString1 linebreak1\nlinebreak2\nlinebreak3\n TestString2 \n";
System.out.println("Original String : [" + trimStartEnd + "]");
System.out.println("-----------------------------");
System.out.println("Result String : [" + trimStartEnd.replaceAll("^(\\r\\n|[\\n\\x0B\\x0C\\r\\u0085\\u2028\\u2029])|(\\r\\n|[\\n\\x0B\\x0C\\r\\u0085\\u2028\\u2029])$", "") + "]");
Start of a string = ^ ,
End of a string = $ ,
regex combination = | ,
Linebreak = \r\n|[\n\x0B\x0C\r\u0085\u2028\u2029]
Another elegant solution.
String myString = "\nLogbasex\n";
myString = org.apache.commons.lang3.StringUtils.strip(myString, "\n");
For anyone else looking for answer to the question when dealing with different linebreaks:
string.replaceAll("(\n|\r|\r\n)$", ""); // Java 7
string.replaceAll("\\R$", ""); // Java 8
This should remove exactly the last line break, preserve all other whitespace in the string, and work with Unix (\n), Windows (\r\n) and old Mac (\r) line breaks: https://stackoverflow.com/a/20056634, https://stackoverflow.com/a/49791415. "\\R" is the linebreak matcher introduced in Java 8 in the Pattern class: https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
This passes these tests:
// Windows:
value = "\r\n test \r\n value \r\n";
assertEquals("\r\n test \r\n value ", value.replaceAll("\\R$", ""));
// Unix:
value = "\n test \n value \n";
assertEquals("\n test \n value ", value.replaceAll("\\R$", ""));
// Old Mac:
value = "\r test \r value \r";
assertEquals("\r test \r value ", value.replaceAll("\\R$", ""));
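If the goal is instead to drop every trailing line break rather than a single one, one possible variation is `\\R+\\z`, where `\\z` anchors at the very end of the input. This is a sketch, not a tested replacement for the code above; the class and method names are illustrative.

```java
class TrailingLineBreaks {
    // Removes all line breaks at the very end of the input (Java 8+ for \R).
    static String dropTrailingLineBreaks(String s) {
        return s.replaceAll("\\R+\\z", "");
    }

    public static void main(String[] args) {
        String value = "\n test \n value \r\n\r\n";
        System.out.println(">" + dropTrailingLineBreaks(value) + "<");
    }
}
```

Line breaks at the start and in the middle, and any other whitespace, are left as they are.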
String text = readFileAsString("textfile.txt");
text = text.replace("\n", "").replace("\r", "");
I have an ASP.NET control that uses a regular expression to validate the user's input for first name and last name. It works for up to 40 characters, and by the looks of the expression it also allows ' for names like O'Donald, and maybe hyphenated names too.
ValidationExpression="^[a-zA-Z''-'\s]{1,40}$"
My problem is with accented characters: Spanish and French names that contain, for example, ñ are not allowed. Does anyone know how to modify my expression to take this into account?
You want
\p{L}: any kind of letter from any language.
From regular-expressions.info
\p{L} or \pL is every character in the unicode table that has the property "letter". So it will match every letter from the unicode table.
You can use this within your character class like this
ValidationExpression="^[\p{L}''-'\s]{1,40}$"
Working C# test:
String[] words = { "O'Conner", "Smith", "Müller", "fooñ", "Fooobar12" };
foreach (String s in words) {
Match word = Regex.Match(s, @"
^ # Match the start of the string
[\p{L}''-'\s]{1,40}
$ # Match the end of the string
", RegexOptions.IgnorePatternWhitespace);
if (word.Success) {
Console.WriteLine(s + ": valid");
}
else {
Console.WriteLine(s + ": invalid");
}
}
Console.ReadLine();
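The same \p{L} idea carries over to other regex engines. Below is a rough Java sketch; note one assumption that differs from the expression above: the hyphen is placed at the end of the character class so that it is matched literally, which would permit hyphenated names. `NameValidator` and `isValidName` are hypothetical names.

```java
import java.util.regex.Pattern;

class NameValidator {
    // Letters from any language, apostrophe, whitespace, and (assumption) a literal hyphen.
    private static final Pattern NAME = Pattern.compile("^[\\p{L}'\\s-]{1,40}$");

    static boolean isValidName(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        // \u00FC is ü and \u00F1 is ñ, escaped to avoid source-encoding issues.
        String[] words = { "O'Conner", "Smith", "M\u00FCller", "foo\u00F1", "Fooobar12" };
        for (String s : words) {
            System.out.println(s + ": " + (isValidName(s) ? "valid" : "invalid"));
        }
    }
}
```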
How do you strip (HTML) tags from a String in Flex 4.5 / 4.6?
I don't think there's an inbuilt function to strip the tags like in php.
However, you could use a regular expression to remove all text between < and >
var r:RegExp=/<\/??.*?\/??>/g;
I gotta run now, but if you could follow my line of thought:
While the string tests positive for the regexp, replace the occurrence with an empty string
That should remove all occurrences of this type:
<tag>
<tag />
</tag>
EDIT
var h:String="<html><head><title>Hello World</title></head><body><h1>Hello</h1>Hey there, what's new?</body></html>";
var r:RegExp=/<\/??.*?\/??>/s; //s=dotall to match . to newline also
while(r.test(h)) {
h=h.replace(r, ""); //Remember, strings are immutable, so h.replace will not change the value of h; you need to assign the result back to h
}
trace(h);
OUTPUT:
Hello WorldHelloHey there, what's new?
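For readers outside Flex, the same idea can be sketched in Java with a single `replaceAll`. This is a naive approach that treats anything between < and > as a tag; it is not an HTML parser and will misbehave on input such as a literal > inside an attribute value. The class and method names are illustrative.

```java
class TagStripper {
    // Removes everything from a '<' up to the next '>' (naive, not a real HTML parser).
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String h = "<html><head><title>Hello World</title></head>"
                + "<body><h1>Hello</h1>Hey there, what's new?</body></html>";
        System.out.println(stripTags(h));
    }
}
```

Because `[^>]` also matches newlines, no dotall flag is needed for tags that span several lines.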
I am trying to write a regular expression that doesn't allow single or double quotes in a string (could be single line or multiline string). Based on my last question, I wrote like this ^(?:(?!"|').)*$, but it is not working. Really appreciate if anybody could help me out here.
Just use a character class that excludes quotes:
^[^'"]*$
(Within the [] character class specifier, the ^ prefix inverts the specification, so [^'"] means any character that isn't a ' or ".)
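As a quick cross-check of the negated character class, here is a small Java sketch. `String.matches` already requires the whole input to match, so the ^ and $ anchors are omitted; the class and method names are made up for illustration.

```java
class QuoteCheck {
    // True when the string contains neither ' nor ", across any number of lines.
    static boolean hasNoQuotes(String s) {
        return s.matches("[^'\"]*");
    }

    public static void main(String[] args) {
        System.out.println(hasNoQuotes("My string without quotes"));
        System.out.println(hasNoQuotes("foo'baa"));
    }
}
```

A negated class also matches newline characters, so multiline input needs no extra flags.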
Just use a regex that matches for quotes, and then negate the match result:
var regex = new Regex("\"|'");
bool noQuotes = !regex.IsMatch("My string without quotes");
Try this:
string myStr = "foo'baa";
bool HasQuotes = myStr.Contains("'") || myStr.Contains("\""); //faster solution , I think.
bool HasQuotes2 = Regex.IsMatch(myStr, "['\"]");
if (!HasQuotes)
{
//no quotes found
}
The regular expression below allows alphanumeric characters and a set of special characters, but no quotes (' and "):
@"^[a-zA-Z-0-9~+:;,/#&_#*%$!()\[\] ]*$"
You can use it like this:
[RegularExpression(@"^[a-zA-Z-0-9~+:;,/#&_#*%$!()\[\] ]*$", ErrorMessage = "Should not allow quotes")]
Note that [ and ] are escaped with a backslash (\) inside the character class.
I need to parse a sentence.
If I find a hash in the sentence, I would like to bold it.
Example: Bonjour #hello Hi => Bonjour #hello Hi (with #hello in bold)
Seems like a good situation for regex
I'd do something like this
boldHashes("Bonjour #hello Hi");
...
private string boldHashes(string str)
{
return Regex.Replace(str, @"(#\w+)", "<strong>$1</strong>");
}
In this case we match a literal hash # followed by a word of any length \w+, and group it with () so we can use the $1 substitution in the Regex.Replace function.
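The same group-and-substitute pattern works in other regex engines too. A minimal Java sketch, with illustrative class and method names:

```java
class HashBolder {
    // Wraps every #word in <strong> tags; $1 refers to the captured group.
    static String boldHashes(String s) {
        return s.replaceAll("(#\\w+)", "<strong>$1</strong>");
    }

    public static void main(String[] args) {
        System.out.println(boldHashes("Bonjour #hello Hi"));
    }
}
```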
Updated jQuery doing the same thing.
Something like:
HTML
<div id="myDiv">Bonjour #Hello hi</div>
jQuery
$('#myDiv').html($('#myDiv').text().replace(/(#\w+)/g, '<strong>$1</strong>'));
There's probably a more elegant way to do this but you can use the String.IndexOf method to find the first instance of the hash like so
String myString = "Bonjour #hello hi";
int index = myString.IndexOf('#');
if(index>-1) //IndexOf returns -1 if the character isn't found
{
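//search for the next space after the hash
int endIndex=myString.IndexOf(' ',index+1);
if(endIndex==-1) endIndex=myString.Length; //no space found: the hash tag runs to the end of the string
myString=MakeBold(myString,index,endIndex);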
}
All that's left for you is to implement the MakeBold function.
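One possible shape for that function, sketched in Java rather than C#. This is only a guess at what was intended: `makeBold` and `boldFirstHash` are hypothetical names, and the choice of <strong> tags is an assumption borrowed from the regex answer.

```java
class IndexOfBolder {
    // Hypothetical helper: wraps the characters in [start, end) in <strong> tags.
    static String makeBold(String s, int start, int end) {
        return s.substring(0, start)
                + "<strong>" + s.substring(start, end) + "</strong>"
                + s.substring(end);
    }

    // Bolds the first #word, where the word ends at the next space or at the end of the string.
    static String boldFirstHash(String s) {
        int index = s.indexOf('#');
        if (index < 0) {
            return s; // no hash present
        }
        int endIndex = s.indexOf(' ', index + 1);
        if (endIndex < 0) {
            endIndex = s.length(); // hash tag is the last word
        }
        return makeBold(s, index, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(boldFirstHash("Bonjour #hello hi"));
    }
}
```

Unlike the regex version, this only handles the first hash in the sentence.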