&nbsp; makes RSS feed fail validation - rss

I'm making an RSS feed from content that's been imported into my site. The content contains &nbsp;, which makes the RSS feed fail validation. I'm trying to strip out the &nbsp; but I'm not sure I'll be able to. The feed looks fine in Google Reader, but IE's reader just displays an error.
Are there any other solutions to my problem? Could I make the doctype less strict, or something similar?
Thanks

You shouldn't need &nbsp; at all if the content will be visible in an RSS feed. If you're using PHP to generate the feed, you can use str_replace to strip it.
$stripped = str_replace("&nbsp;", " ", $original);
So if you're iterating through a loop, $original will be the variable holding the original data; when outputting the data, use $stripped instead. This replaces each &nbsp; with a single space.

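You could replace all &nbsp; entities with &#160;, the numeric character reference for the same character. Unlike &nbsp;, it is valid in XML without a DTD that declares the entity.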
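Both answers above can be combined into one step. The following is only a sketch, assuming the feed is generated in PHP 7 or later and the imported content is UTF-8; the function name rss_safe_text is hypothetical. It decodes every HTML entity (not just &nbsp;), swaps the non-breaking space for an ordinary one, and then re-escapes only the characters that XML itself reserves.

```php
<?php
// Hypothetical helper (not from the original answers): make imported
// HTML text safe to place inside an RSS element.
function rss_safe_text(string $html): string
{
    // Turn named entities such as &nbsp; into real UTF-8 characters.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Replace the non-breaking space (U+00A0) with an ordinary space.
    $decoded = str_replace("\u{00A0}", ' ', $decoded);

    // Re-escape only the characters that XML reserves (&, <, >, quotes).
    return htmlspecialchars($decoded, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

echo rss_safe_text('Fish&nbsp;&amp;&nbsp;chips'), "\n";
// prints: Fish &amp; chips
```

Decoding first means other HTML-only entities in the imported content are handled the same way, instead of needing one str_replace call per entity.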

Related

What is the optimal way to encode a tag to use in algolia?

I have an article page that lists the tags related to that article. When the user clicks on a tag, it brings them to the Algolia search results page. This is a WordPress website. Some of my tags contain ampersands, like "Spades & Shovels", for example.
What I've noticed is that when I urlencode this term it does not display the search term properly in the search box when I send it via a query string.
I've tried this and thought this was the secret sauce, but it doesn't always work.
$tag_name = json_encode(urlencode(html_entity_decode($tag->name)));
What would be the best way to encode a tag name so that when I pass it via a query string to the search results page it handles it properly?
I've done more testing and I am noticing some odd things on my search results page. If I come to the search page with the searchbox field empty, but pass a post_id (I'm dynamically loading the tags associated with a post in the filter section), I can see "Spades & Shovels" listed there and it has a count of "65" next to it.
If I type "Spades & Shovels" into the searchbox, I see that number quickly drop down to 10 in the sidebar.
When I pass the tag in the query string, no matter how I encode it, it doesn't seem to work. I see the words "Spades & Shovels" in the search box, but I don't get any results. It's very strange, but I'm hoping it's something simple to fix. I need to be able to pass & in a query string to my search results page, but I have not found the proper formula for sending an ampersand in the URL.
It does seem like the value I am passing through in the query string is not an exact match to the actual tag name.
I have tried all of these possibilities and different combinations:
$clean_tag = json_encode(urlencode(html_entity_decode($tag->name)));
$clean_tag = htmlspecialchars_decode($clean_tag);
$clean_tag = html_entity_decode($clean_tag);
$clean_tag = urlencode($clean_tag);
$clean_tag = htmlentities($clean_tag);
$clean_tag = html_entity_decode($clean_tag);
$clean_tag = json_encode($clean_tag);
None of these seem to do the trick. Any thoughts?
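One possibility worth ruling out, offered as a sketch rather than a confirmed fix: the stored tag name may contain the entity &amp; rather than a bare &, and json_encode wraps its result in double quotes, so the quotes become part of the value being sent. The helper name tag_search_url and the query parameter q below are assumptions for illustration; the real parameter depends on how the search page reads its input.

```php
<?php
// Hypothetical helper: build a search URL from a tag name.
// The "q" parameter name is an assumption, not taken from the question.
function tag_search_url(string $base, string $tagName): string
{
    // Decode entities first, so "&amp;" becomes a plain "&".
    $plain = html_entity_decode($tagName, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Percent-encode exactly once; "&" becomes %26 and cannot be
    // mistaken for a query-string separator.
    return $base . '?q=' . rawurlencode($plain);
}

echo tag_search_url('/search/', 'Spades &amp; Shovels'), "\n";
// prints: /search/?q=Spades%20%26%20Shovels
```

If the page on the receiving end decodes the parameter once, it should get back the string "Spades & Shovels" with no entity and no surrounding quotes, which can then be compared against the tag name stored in the index.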

Create custom URL for WordPress RSS2 Feed

I wrote an RSS2 feed on WordPress a while back, but for some reason, some of the URLs aren't working anymore. The current version of WP is 4.7.2.
For example, https://justhoodsbyawdis.com/product/jh001/feed/ works, but https://justhoodsbyawdis.com/brands/feed/ does not.
Note that https://justhoodsbyawdis.com/product/jh001/ is a valid page on the site, but that https://justhoodsbyawdis.com/brands/ is not, because it is only valid for feeds. The latter results in an "ERROR: This is not a valid feed." message.
Is there a way to make a URL for an RSS2 feed, even without an associated WP page (i.e. without the "/feed/" at the end)?
Thanks!
Rob
EDIT 1:
I added a post called "brands", which fixed the problem. The only thing is that the dummy post is viewable by anyone. Any ideas how to block it, but not the feed?
Another problem is that query strings break the feed, for instance:
https://justhoodsbyawdis.com/products/feed/?name=hoodies
doesn't work, although it does without the "?name=hoodies".
How would I make that work?
EDIT 2:
It would appear that the name query string parameter is now causing problems - see:
https://codex.wordpress.org/Function_Reference/register_taxonomy#Reserved_Terms
Is there a way to make it backwards compatible? Otherwise, the existing app that calls the feed will also have to be changed...
I wound up creating dummy pages to fix the invalid feed error.
I had to change the "name" query string parameter to "prod_name" so as to not conflict with reserved terms.
Rob

Scraping a page to retrieve prices, currency code messing things up

I'm scraping a page using PHP Simple HTML DOM Parser and I want to retrieve the price. It's been going well except for a page that I've encountered, where the html reads:
<p class="was-price">Was: &#163;220.00</p>
I want to scrape the part that reads 220.00 and I am very confused about how to retrieve it. Thus far I have been using preg_replace() with great success to strip out text from a string, yet this is the first time I have come across a currency symbol in numeric format.
Today is the first day I have used preg_replace() and it's confusing to say the least. Can it be used to remove currency symbols in this way? Or should I be looking at another method? Thanks
Use html_entity_decode() to decode encoded html entities. Then you apply preg_replace().
$str = '<p class="was-price">Was: &#163;220.00</p>';
$str = html_entity_decode($str);
echo $str;
preg_replace(...);
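The elided preg_replace() step could take several forms. As one hedged possibility, the sketch below uses preg_match to pull the number out instead of stripping everything else away. The function name extract_price is hypothetical, and the pattern assumes prices with no thousands separators.

```php
<?php
// Hypothetical helper: return the first decimal number in an HTML
// fragment, or null if there is none.
function extract_price(string $html): ?string
{
    // Remove the markup, then decode entities such as &#163; so their
    // digits are not mistaken for part of the price.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Match digits with an optional fractional part, e.g. 220.00.
    if (preg_match('/\d+(?:\.\d+)?/', $text, $matches)) {
        return $matches[0];
    }
    return null;
}

echo extract_price('<p class="was-price">Was: &#163;220.00</p>'), "\n";
// prints: 220.00
```

Decoding before matching matters here: run against the raw markup, the same pattern would match the 163 inside the entity first.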

Infowindow HTML code showing from JSON query

I'm pulling JSON data for display on a map, but when I grab a JSON field that has HTML code (ie, <br>) the code shows in the infowindow instead of rendering as a line break.
http://www.yourmapper.com/demo/v3infowindow.htm
Any ideas on how to force rendering?
The JSON call is in the code, but here it is directly:
http://www.yourmapper.com/api/markers.php?&lat=38.23282191&lon=-85.7209389&id=152&f=json
Thanks. (Note, I posted this in the old Google Maps Group before noticing that tech questions have been moved here).
The JSON doesn't contain <br>; it contains &lt;br&gt;. This issue is not caused by Google Maps.
You may "decode" the strings by changing this line:
yourdescription[i] = item.description;
into
yourdescription[i] = $('<div/>').html(item.description).text();
But if I may assume that you are the developer of this service, you had better fix this on the server side and return the expected result.
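A server-side fix might look roughly like the following. This is a sketch only: it assumes the service is written in PHP (the URL ends in markers.php), that the description field is the one holding escaped markup, and that the content is trusted, since decoding it lets the markup render in the infowindow. The function name decode_description is hypothetical.

```php
<?php
// Hypothetical helper: undo HTML escaping on one field before the row
// is serialized to JSON.
function decode_description(array $row): array
{
    // &lt;br&gt; becomes <br>; only do this for trusted content.
    $row['description'] = htmlspecialchars_decode($row['description'], ENT_QUOTES);
    return $row;
}

$row = ['description' => 'Line one&lt;br&gt;Line two'];
echo json_encode(decode_description($row)), "\n";
// prints: {"description":"Line one<br>Line two"}
```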

ASP.NET special character problem

I'm building an automated RSS feed in ASP.NET and occurrences of apostrophes and hyphens are rendering very strangely:
"Here's a test" is rendering as "Here’s a test"
I have managed to circumvent a similar problem with the pound sign (£) by escaping the ampersand and building the HTML escape for £ manually, as shown in the extract below:
sArticleSummary = sArticleSummary.Replace("£", "&pound;")
But the following attempt is failing to resolve the apostrophe issue; we still get ’ on the screen.
sArticleSummary = sArticleSummary.Replace("’", "&#146;")
The string in the database (SQL2005) for all intents and purposes appears to be plain text - can anyone advise why what seem to be plain text strings keep coming out in this manner, and if anyone has any ideas as to how to resolve the apostrophe issue that'd be appreciated.
Thanks for your help.
[EDIT]
Further to Vladimir's help, it now looks as though the problem is that somewhere between the database and it being loaded into the string var the data is converting from an apostrophe to ’ - has anyone seen this happen before or have any pointers?
Thanks
I would guess that the column in your SQL 2005 database is defined as a varchar(N), char(N) or text. If so, the conversion is due to the database driver using a different code page setting from the one set in the database.
I would recommend changing this column (and any others that may contain non-ASCII data) to nvarchar(N), nchar(N) or nvarchar(max) respectively, which can then contain any Unicode code point, not just those defined by the code page.
All of my databases now use nvarchar/nchar exclusively to avoid these types of encoding issues. The Unicode fields use twice as much storage space, but there'll be very little performance difference if you use this technique (the SQL engine uses Unicode internally).
Transpires that the data (whilst showing in SQLServer plain) is actually carrying some MS Word special characters.
Assuming you get Unicode characters from the database, the easiest way is to let System.Xml.dll take care of the escaping for you by building the RSS feed with an XmlDocument object. (I'm not sure about the elements found in an RSS feed.)
XmlDocument rss = new XmlDocument();
rss.LoadXml("<?xml version='1.0'?><rss />");
XmlElement element = rss.DocumentElement.AppendChild(rss.CreateElement("item")) as XmlElement;
element.InnerText = sArticleSummary;
or with Linq.Xml:
XDocument rss = new XDocument(
    new XElement("rss",
        new XElement("item", sArticleSummary)
    )
);
I would just put "Here's a test" into a CDATA tag. Easy and it works.
<![CDATA[Here's a test]]>
