Step-by-step: How to show a feed of Medium articles on your WordPress website (without a plugin).

Ascend
12 min read · Aug 23, 2023

Photo by Austin Chan on Unsplash

Full disclosure: you will need some development skills, though.

This is a two-part exercise too — the first will simply look at how we can include Medium articles in our WordPress website. Our second instalment will show you how to go one step further: improving load times by locally storing necessary content in your WordPress instance on-the-fly.

Despite us championing Medium ourselves for all our article writing purposes, we do want to market our articles to our website visitors and demonstrate our thought-leadership, spreading the reach of our content to as many people as possible.

Ok, let’s get started.

Prerequisites

You’ll need the following:

  • A WordPress installation (we recommend a local version whilst you build/test the integration)
  • Some understanding of WordPress coding: we won’t be giving beginner’s-guide lessons on every method in our code examples.
  • Some type of IDE/code editor: we use PhpStorm
  • At least PHP 8.0. We’ll be using the null-safe operator (?->) in our code, so it won’t work on anything older. That said, you can adapt the code if you’re familiar enough with PHP.
  • Advanced Custom Fields, simply because it makes adding custom fields easier. This will feature more in our second article, though.
  • TailwindCSS: we’ll be using it in our code examples. You can install it via npm, link to it from the CDN if you don’t fancy that, or style the output however you prefer by changing the template we’ve provided.

What we’re wanting to achieve

The final output is going to look something like this.

Figure: desired output of Medium articles on our website

What we’re achieving is a very customised look and feel — very different to Medium’s, without having to move all our article writing onto our own website. Some may say, the best of both worlds.

Example feed response

We’ll share an example response here, because some of our development choices follow directly from how it is structured.

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
<channel>
<title>
<![CDATA[Stories by Ascend on Medium]]>
</title>
<description>
<![CDATA[Stories by Ascend on Medium]]>
</description>
<link>https://medium.com/@ascend-agency?source=rss-b15225ed0a8e------2</link>
<image>
<url>https://cdn-images-1.medium.com/fit/c/150/150/1*K9u52eX09zvbltN0tXP13Q.jpeg</url>
<title>Stories by Ascend on Medium</title>
<link>https://medium.com/@ascend-agency?source=rss-b15225ed0a8e------2</link>
</image>
<generator>Medium</generator>
<lastBuildDate>Tue, 22 Aug 2023 22:48:24 GMT</lastBuildDate>
<atom:link href="https://medium.com/@ascend-agency/feed" rel="self" type="application/rss+xml"/>
<webMaster>
<![CDATA[yourfriends@medium.com]]>
</webMaster>
<atom:link href="http://medium.superfeedr.com" rel="hub"/>
<item>
<title>
<![CDATA[Serving Laravel from public to prevent making too much public]]>
</title>
<link>https://ascend-agency.medium.com/serving-laravel-from-public-to-prevent-making-too-much-public-51fb5994c584?source=rss-b15225ed0a8e------2</link>
<guid isPermaLink="false">https://medium.com/p/51fb5994c584</guid>
<category>
<![CDATA[laravel-security]]>
</category>
<category>
<![CDATA[website-hacking]]>
</category>
<category>
<![CDATA[exploit-development]]>
</category>
<category>
<![CDATA[laravel]]>
</category>
<category>
<![CDATA[laravel-development]]>
</category>
<dc:creator>
<![CDATA[Ascend]]>
</dc:creator>
<pubDate>Tue, 22 Aug 2023 11:54:45 GMT</pubDate>
<atom:updated>2023-08-22T11:54:45.628Z</atom:updated>
<content:encoded>
<![CDATA[ ... all our article's content ... ]]>
</content:encoded>
</item>
</channel>
</rss>

For our articles, we’re interested in the item elements. We may use some of the other data in our actual integration to populate content around the articles on the website, but it isn’t the focus of this guide.

Reading the RSS feed from Medium

First, we need to read the RSS feed from Medium. There are two URL formats you can use:

  1. https://[username].medium.com/feed
  2. https://medium.com/@[username]/feed

We’ll assume there’s an ACF URL field set up for the specific page, with the field name medium_rss_feed.

$xml = simplexml_load_string(
    file_get_contents(
        get_field('medium_rss_feed') // https://ascend-agency.medium.com/feed
    )
);

In reality, this would probably be in an Option page or set in a global space so that it only needs to be defined once. The call then would look something like get_field('medium_rss_feed', 'option') but since Option pages aren’t included in the free version of ACF, we’ve omitted that from the example.
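The call above assumes the download and the parse both succeed. As a minimal sketch of a more defensive version (parseFeedXml is a hypothetical helper name of ours, not part of WordPress, ACF or PHP), you could return null instead of letting warnings leak onto the page:

```php
<?php
// Hypothetical helper: parses an RSS string and returns null, rather than
// emitting warnings, when the payload is missing or malformed.
function parseFeedXml(string|false $body): ?SimpleXMLElement
{
    if ($body === false || trim($body) === '') {
        return null; // the download failed or returned nothing
    }

    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier setting

    return $xml === false ? null : $xml;
}

// Inside WordPress, the call from the article would become something like:
// $xml = parseFeedXml(@file_get_contents(get_field('medium_rss_feed')));
$xml = parseFeedXml('<rss><channel><item><title>Hello</title></item></channel></rss>');
echo $xml === null ? 'Feed unavailable' : (string) $xml->channel->item[0]->title;
```

If you would rather not use file_get_contents for remote URLs, WordPress also ships wp_remote_get and wp_remote_retrieve_body, whose result could be passed to the same helper.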

Looping through the response

Looping through the data we get back in return couldn't be easier. You can test how successful you’ve been in your quest so far with the below code, which would output the title of the articles retrieved.

foreach ($xml->channel->item as $item) {
    echo (string) $item->title . "<br />";
}

Once you know it’s working, you can start to create a more meaningful output with the data you’ve got. We will of course want to link off to the article using the $item->link element, and we can even make use of the tags by looping through the $item->category values we have been given for each article.
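As a quick illustration of the tags idea, here is a small self-contained sketch. The inline feed is a cut-down stand-in for Medium’s response and the example.com link is a placeholder, so treat it as a demonstration rather than our production code:

```php
<?php
// A trimmed-down stand-in for the Medium feed (placeholder link).
$xml = simplexml_load_string(<<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <title><![CDATA[Serving Laravel from public]]></title>
      <link>https://example.com/article</link>
      <category><![CDATA[laravel]]></category>
      <category><![CDATA[laravel-security]]></category>
    </item>
  </channel>
</rss>
XML);

foreach ($xml->channel->item as $item) {
    // Repeated <category> elements can be iterated like an array.
    $tags = [];
    foreach ($item->category as $category) {
        $tags[] = (string) $category;
    }

    echo (string) $item->title . ' [' . implode(', ', $tags) . ']' . PHP_EOL;
}
// prints "Serving Laravel from public [laravel, laravel-security]"
```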

First though, let’s focus on getting content as you’ll notice its naming structure is different in the response.

Namespacing response attributes

SimpleXML exposes child elements as object properties. When an element name contains characters that aren’t valid in a PHP property name, such as a hyphen, you can use the curly-brace syntax to access it instead.

Given that content:encoded is the element inside the RSS feed's <item>, you would be forgiven for believing you could retrieve its value like this:

foreach ($xml->channel->item as $item) {

    ...

    $item->{'content:encoded'};
}

However, this didn’t play ball for us: {'content:encoded'} came back empty. That could have been for several reasons but, to save you the time, the actual cause was namespacing. The part before the colon (content) is a namespace prefix rather than part of the element name, and SimpleXML’s property access doesn’t see prefixed elements by default.

RSS feeds often use namespaces, especially when they make use of custom tags that are not standard to RSS.

To overcome this issue, we introduced the following code block after reading the XML string.

$namespaces = $xml->getNamespaces(true);
foreach ($namespaces as $prefix => $ns) {
    $xml->registerXPathNamespace($prefix, $ns);
}

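How does this affect how we access the content:encoded element? We pass the namespace URI to children(). (Strictly, only the getNamespaces() call is needed for this; registerXPathNamespace() matters if you also query the feed with $xml->xpath().)

foreach ($xml->channel->item as $item) {

    ...

    $encodedContent = (string) $item->children($namespaces['content'])->encoded;
}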

Voila. We now have the content we’ve been craving.
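If you want to see the difference for yourself, the sketch below runs outside WordPress against a cut-down feed of our own. It compares the property access that came back empty with the two children() forms: one by namespace URI (as above) and one by prefix, using the second argument of children().

```php
<?php
// Minimal feed with the same content namespace Medium declares.
$xml = simplexml_load_string(<<<'XML'
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Example</title>
      <content:encoded><![CDATA[<p>Hello from Medium</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML);

$item = $xml->channel->item[0];

// Property access does not look inside the "content" namespace.
$viaProperty = (string) $item->{'content:encoded'};

// children() with the namespace URI...
$viaUri = (string) $item->children('http://purl.org/rss/1.0/modules/content/')->encoded;

// ...or with the prefix, by passing true as the second argument.
$viaPrefix = (string) $item->children('content', true)->encoded;

var_dump($viaProperty); // string(0) ""
var_dump($viaUri);      // string(24) "<p>Hello from Medium</p>"
var_dump($viaPrefix);   // string(24) "<p>Hello from Medium</p>"
```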

Accessing featured images

We put all our featured images at the top of our Medium articles, which is the layout the PHP code below has been tested against. If your featured image sits elsewhere, you may need to play around with some of the logic to get it to work how you want.

Let’s analyse what we have in this example and think, step by step, about what we want to do:

<content:encoded>
<![CDATA[
<h3>Serve Laravel from public to prevent making too much public</h3>
<h3>TL;DR: Serve a website from the project root instead of the public directory and you will expose your .env file, potentially leading to huge security issues with sensitive data leakage.</h3>
<figure>
<img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OS1JepYWNppEKqIUT3K58Q.jpeg" />
<figcaption>
Photo by <a href="https://unsplash.com/@towfiqu999999?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Towfiqu barbhuiya</a> on<a href="https://unsplash.com/photos/em5w9_xj3uU?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a>
</figcaption>
</figure>
...
]]>
</content:encoded>

We’ve reformatted the snippet above to make it more legible as we talk through it.

We have our featured image: it’s in the src attribute of the img tag inside the figure element.

We want to:

  1. Parse the HTML code within our PHP loop
  2. Find that img tag
  3. Extract the src attribute

Ladies & Gentlemen, it’s DOMDocument

To extract the image src from the HTML string inside of content:encoded, we’re going to utilise the DOMDocument and DOMXPath classes in PHP.

Here's how we’ll extract the src of the img tag:

foreach ($xml->channel->item as $item) {

    ...

    $encodedContent = (string) $item->children($namespaces['content'])->encoded;

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($encodedContent);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    $src = $xpath->query('//img/@src');
    if (!$src->length) {
        throw new RuntimeException('No images found');
    }

    $image = $src[0]?->nodeValue;
}

Let’s deep dive into a few lines of code.

// we set this to true so that we suppress warnings/errors related to invalid HTML
libxml_use_internal_errors(true);

We’ll then do some housekeeping:

// removes any suppressed content that's been added to the libxml error buffer
libxml_clear_errors();

Not strictly essential — but helpful to our cause as we test — is the error we throw if no images are found:

$src = $xpath->query('//img/@src');
if (!$src->length) {
    throw new RuntimeException('No images found');
}

This just draws our attention to something being wrong in the $xpath->query() call for the img src attribute if the returned node list is empty. The reality is that it isn’t strictly needed, as we’re using the null-safe operator beneath it:

$image = $src[0]?->nodeValue;
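To try the extraction outside of the feed loop, here is a standalone sketch. It is a variation on our code rather than a copy: firstFigureImage is a hypothetical name, the XPath is narrowed to images inside a figure (which matches our layout but may not match yours), and the image URL is a placeholder. It also guards against an empty string, which loadHTML() rejects on PHP 8.

```php
<?php
// Hypothetical helper: returns the src of the first <img> inside a <figure>,
// or null when there isn't one.
function firstFigureImage(string $html): ?string
{
    if (trim($html) === '') {
        return null; // guard: loadHTML() rejects an empty string
    }

    $previous = libxml_use_internal_errors(true); // HTML5 tags trigger libxml warnings
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query('//figure//img/@src');

    // item(0) is null for an empty list, so the null-safe operator covers it.
    return $nodes->item(0)?->nodeValue;
}

$html = '<h3>Title</h3><figure><img alt="" src="https://example.com/featured.jpg" /></figure>';

var_dump(firstFigureImage($html));               // string(32) "https://example.com/featured.jpg"
var_dump(firstFigureImage('<p>No image here</p>')); // NULL
```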

Displaying an output

OK. So we’ve done a lot of the hard work. Ironically, we’re not going to do much with the data from within the content:encoded tags other than use it to get our featured image.

You could — if you wanted — use a snippet for an excerpt. You may wish to consider using wp_trim_words to limit the output.
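If you do go down the excerpt route, the idea looks roughly like this. Inside WordPress we would simply reach for wp_trim_words; the sketch below is a plain-PHP stand-in (excerptFromHtml is a hypothetical helper of ours) so that it can be run on its own:

```php
<?php
// Hypothetical plain-PHP stand-in for what wp_trim_words would do in WordPress:
// strip the markup, then keep the first $maxWords words.
function excerptFromHtml(string $html, int $maxWords = 30): string
{
    // Replace tags with spaces so adjacent blocks don't fuse into one word.
    $text = preg_replace('/<[^>]*>/', ' ', $html) ?? '';
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    if (count($words) <= $maxWords) {
        return implode(' ', $words);
    }

    return implode(' ', array_slice($words, 0, $maxWords)) . '...';
}

echo excerptFromHtml('<h3>Serve Laravel from public</h3><p>to prevent making too much public</p>', 5);
// prints "Serve Laravel from public to..."
```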

Back on task though, here’s our complete code so far:

$xml = simplexml_load_string(
    file_get_contents(
        get_field('medium_rss_feed') // https://ascend-agency.medium.com/feed
    )
);

$namespaces = $xml->getNamespaces(true);
foreach ($namespaces as $prefix => $ns) {
    $xml->registerXPathNamespace($prefix, $ns);
}

$articles = []; // we'll use this variable to hold the data we want to output
foreach ($xml->channel->item as $item) {

    $encodedContent = (string) $item->children($namespaces['content'])->encoded;

    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($encodedContent);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);
    $src = $xpath->query('//img/@src');

    $articles[] = [
        'title' => (string) $item->title,
        'link' => (string) $item->link,
        'featured_image' => $src[0]?->nodeValue,
        'published' => (string) $item->children($namespaces['atom'])->updated,
    ];

}

Why everything is type-cast as (string)

Great question.

When you work with SimpleXMLElement objects in PHP, you're not directly working with strings or scalar values. Instead, you're interacting with objects that represent XML nodes. These objects contain methods and properties that help manage and manipulate the underlying XML structure.

When you access the text content of a node, even though it may visually look like a string, it is still wrapped in a SimpleXMLElement object. The need to type-cast it to a string is so that you get the actual string value of the node rather than working with the object representation of the node.

If you don’t type-cast, you’ll keep passing SimpleXMLElement objects around, and that tends to bite later: strict comparisons against strings fail, and the objects can’t be serialized, so they can’t be cached or stored as they are. PHP doesn’t automatically convert SimpleXMLElement objects to strings in all contexts.

Some functions and operations in PHP will trigger the object's __toString() method, causing it to behave as if you had manually cast it to a string. This is how it can get confusing, quickly. To ensure consistent behaviour across all parts of our code, we’re applying good practice by explicitly casting each element to a string whenever a string is what we’re after.
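A short, self-contained sketch makes the difference visible. The one-element XML string is ours, purely for illustration:

```php
<?php
$xml = simplexml_load_string('<item><title>Hello</title></item>');

$raw  = $xml->title;          // still a SimpleXMLElement
$cast = (string) $xml->title; // a plain PHP string

var_dump($raw instanceof SimpleXMLElement); // bool(true)
var_dump(is_string($cast));                 // bool(true)
var_dump($raw === 'Hello');                 // bool(false): an object is never identical to a string
var_dump($cast === 'Hello');                // bool(true)

// Storing the raw object causes trouble later on: SimpleXMLElement
// cannot be serialized, whereas the cast string can.
try {
    serialize(['title' => $raw]);
    $rawSerializable = true;
} catch (Throwable $e) {
    $rawSerializable = false;
}

var_dump($rawSerializable);                          // bool(false)
var_dump(is_string(serialize(['title' => $cast]))); // bool(true)
```

The serialization point matters as soon as you want to cache or persist the article data, which is where our second instalment is heading.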

A little refactoring

The whole piece of code is starting to get a bit bloated. The reality is this functionality belongs in a Utility class or Service, which would be responsible for the entire logic.

For this example though, we’ve just refactored what we have in a procedural way for ease:

function getFeaturedImage(string $encodedContent): string
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($encodedContent);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);
    $src = $xpath->query('//img/@src');

    return $src[0]?->nodeValue ?? '';
}

$xml = simplexml_load_string(
    file_get_contents(
        get_field('medium_rss_feed') // https://ascend-agency.medium.com/feed
    )
);

$namespaces = $xml->getNamespaces(true);
foreach ($namespaces as $prefix => $ns) {
    $xml->registerXPathNamespace($prefix, $ns);
}

$articles = []; // we'll use this variable to hold the data we want to output
foreach ($xml->channel->item as $item) {

    $encodedContent = (string) $item->children($namespaces['content'])->encoded;

    $articles[] = [
        'title' => (string) $item->title,
        'link' => (string) $item->link,
        'featured_image' => getFeaturedImage($encodedContent),
        'published' => (string) $item->children($namespaces['atom'])->updated,
    ];

}

We now have an array of the data we want in the variable $articles. Its contents look something like this:

[
[
"title" => "Serving Laravel from public to prevent making too much public",
"link" => "https://ascend-agency.medium.com/serving-laravel-from-public-to-prevent-making-too-much-public-51fb5994c584?source=rss-b15225ed0a8e------2",
"featured_image" => "https://cdn-images-1.medium.com/max/1024/1*OS1JepYWNppEKqIUT3K58Q.jpeg",
"published" => "2023-08-22T11:54:45.628Z",
],
[
"title" => "Async-or-swim evolution? Asynchronous PHP: Non-Blocking Code Execution",
"link" => "https://ascend-agency.medium.com/async-or-swim-evolution-asynchronous-php-non-blocking-code-execution-c75b285b2bbb?source=rss-b15225ed0a8e------2",
"featured_image" => "https://cdn-images-1.medium.com/max/1024/1*slZR4US7-L7TkkIUB9cY9Q.jpeg",
"published" => "2023-08-22T10:42:57.621Z",
],
[...],
[...],
];
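For anyone who prefers the class-based shape we mentioned at the start of this section, here is one way it could look. This is a sketch, not the service we run in production: MediumFeedParser and its method names are hypothetical, it takes the raw RSS string so it can be exercised without WordPress, and the example.com URLs in the usage are placeholders.

```php
<?php
// Hypothetical class: the name and method signatures are ours, for illustration only.
final class MediumFeedParser
{
    /** @return array<int, array<string, string>> */
    public function parse(string $rss): array
    {
        $previous = libxml_use_internal_errors(true);
        $xml = simplexml_load_string($rss);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if ($xml === false || !isset($xml->channel->item)) {
            return []; // malformed feed, or a feed without articles
        }

        $articles = [];
        foreach ($xml->channel->item as $item) {
            // children('prefix', true) looks the namespace up by its prefix.
            $content = (string) $item->children('content', true)->encoded;

            $articles[] = [
                'title'          => trim((string) $item->title),
                'link'           => trim((string) $item->link),
                'featured_image' => $this->featuredImage($content),
                'published'      => (string) $item->children('atom', true)->updated,
            ];
        }

        return $articles;
    }

    private function featuredImage(string $html): string
    {
        if (trim($html) === '') {
            return ''; // nothing to parse
        }

        $previous = libxml_use_internal_errors(true);
        $doc = new DOMDocument();
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $src = (new DOMXPath($doc))->query('//img/@src');

        return $src->item(0)?->nodeValue ?? '';
    }
}

// Usage with a cut-down feed (placeholder URLs):
$rss = <<<'XML'
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <item>
      <title><![CDATA[Hello Medium]]></title>
      <link>https://example.com/hello</link>
      <atom:updated>2023-08-22T11:54:45.628Z</atom:updated>
      <content:encoded><![CDATA[<figure><img src="https://example.com/hello.jpg" /></figure>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$articles = (new MediumFeedParser())->parse($rss);
print_r($articles);
```

Inside WordPress, the string passed to parse() would come from the feed URL stored in the ACF field, exactly as in the procedural version.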

All that’s left to do now is work through the array and put it into our design.

<div class="grid lg:grid-cols-3">
    <?php foreach ($articles as $article) : ?>
        <article class="relative isolate flex flex-col justify-end overflow-hidden rounded-2xl bg-neutral-900 px-8 pb-8 pt-80 sm:pt-48 lg:pt-80">
            <img
                class="absolute inset-0 -z-10 h-full w-full object-cover"
                src="<?php echo $article['featured_image']; ?>"
                alt="Image for <?php echo $article['title']; ?>"
            />
            <div class="absolute inset-0 -z-10 bg-gradient-to-t from-neutral-900 via-neutral-900/40"></div>
            <div class="absolute inset-0 -z-10 rounded-2xl ring-1 ring-inset ring-neutral-300/10"></div>
            <div class="flex flex-wrap items-center gap-y-1 overflow-hidden text-sm leading-6 text-neutral-300">
                <time datetime="<?php echo $article['published']; ?>" class="sr-only"><?php echo (new DateTime($article['published']))->format('d F Y H:i'); ?></time>
            </div>
            <h3 class="mt-4 text-lg font-semibold leading-6 text-white">
                <a href="<?php echo $article['link']; ?>" target="_blank">
                    <span class="absolute inset-0"></span>
                    <?php echo $article['title']; ?>
                </a>
            </h3>
        </article>
    <?php endforeach; ?>
</div>

There are a few things we’d want to do here, but as it’s the first time you’re seeing the template, we’ve kept it pretty clean and on-topic.

For reference though, you should consider the following:

Sanitise the Medium values before they’re output.

<a href="<?php echo esc_url($article['link'], ['https']); ?>" target="_blank">
    <span class="absolute inset-0"></span>
    <?php echo esc_html($article['title']); ?>
</a>

NEVER trust user inputs. Even if you trust the source they’re coming from, always escape dynamic data.

esc_html and esc_attr encode the <, >, &, " and ' (less than, greater than, ampersand, double quote and single quote) characters. Use esc_html for text between tags and esc_attr for values inside HTML attributes.

esc_url removes a number of characters from the URL. An empty string is also returned if $url specifies a protocol other than those in $protocols, which is passed as an array (here, ['https']).
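The WordPress escaping functions only exist inside WordPress, so to show what escaping actually does to a hostile value, here is a plain-PHP sketch. It uses htmlspecialchars and parse_url as stand-ins for the escaping helpers and the protocol allow-list; isHttpsUrl is a hypothetical helper of ours, and the title is made up:

```php
<?php
// A made-up title containing markup we would not want rendered.
$title = 'Tips & tricks: <script>alert("hi")</script>';

// Plain-PHP stand-in for the escaping helpers: special characters become entities.
$escaped = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
echo $escaped . PHP_EOL;
// prints: Tips &amp; tricks: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;

// Hypothetical helper illustrating the idea behind a protocol allow-list.
function isHttpsUrl(string $url): bool
{
    return filter_var($url, FILTER_VALIDATE_URL) !== false
        && strtolower((string) parse_url($url, PHP_URL_SCHEME)) === 'https';
}

var_dump(isHttpsUrl('https://example.com/article?source=rss')); // bool(true)
var_dump(isHttpsUrl('http://example.com/article'));             // bool(false)
var_dump(isHttpsUrl('javascript:alert(1)'));                    // bool(false)
```

In a template, stick with esc_html, esc_attr and esc_url; the stand-ins here are only to make the behaviour easy to poke at from the command line.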

Remove the DateTime call

Sidebar: we love Laravel. That’s no secret. But what it’s also done for our WordPress mentality is excellent. We’re forever wanting to emulate the MVC way and have views in WordPress (as close as can be) responsible for nothing more than presenting information.

This line, for example, troubles us by going somewhat against that principle:

(new DateTime($article['published']))->format('d F Y H:i')

There are two solutions here:

  1. Move this logic inside of our $xml->channel->item loop and store the result in the $article['published'] value, most likely making this an array alongside the current timestamp. This approach gives you control over the format of the date/time output.
  2. If you’re happy with the format that we get from Medium though, you could include that with minimal fuss. Again, we’d probably put that inside $article['published'] too.

Here’s how both of those options could look:

// Option 1
$articles[] = [
    ...
    'published' => [
        'timestamp' => (string) $item->children($namespaces['atom'])->updated,
        // cast to string before handing the value to DateTime
        'for_humans' => (new DateTime((string) $item->children($namespaces['atom'])->updated))->format('d F Y H:i'),
    ],
];

// Option 2
$articles[] = [
    ...
    'published' => [
        'timestamp' => (string) $item->children($namespaces['atom'])->updated,
        'for_humans' => (string) $item->pubDate,
    ],
];

Work with the default date and time formats set in WordPress

If you fancy option 1 and want to be super clean and uniform, you can match the date and time format used elsewhere in WordPress by reading the values that are configurable via the CMS.

This also means your users have flexibility to change the readable output without any developer intervention required.

This is how Option 1 would be adapted to accommodate that:

$dateFormat = get_option('date_format');
$timeFormat = get_option('time_format');

$dateTimeFormat = sprintf('%s %s', $dateFormat, $timeFormat);

...

// Option 1
$articles[] = [
    ...
    'published' => [
        'timestamp' => (string) $item->children($namespaces['atom'])->updated,
        // cast to string before handing the value to DateTime
        'for_humans' => (new DateTime((string) $item->children($namespaces['atom'])->updated))->format($dateTimeFormat),
    ],
];
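To see what the formatting step produces without a WordPress install, here is a standalone sketch. The two format strings are assumptions standing in for whatever get_option('date_format') and get_option('time_format') return on your site, and Europe/London is just an example timezone. Medium’s timestamp is in UTC (the trailing Z), so converting it to your site’s timezone is worth considering:

```php
<?php
// Assumed values, standing in for get_option('date_format') and get_option('time_format').
$dateTimeFormat = sprintf('%s %s', 'F j, Y', 'g:i a');

// The atom:updated value from the example feed (UTC).
$published = new DateTimeImmutable('2023-08-22T11:54:45.628Z');

echo $published->format('d F Y H:i') . PHP_EOL;   // 22 August 2023 11:54
echo $published->format($dateTimeFormat) . PHP_EOL; // August 22, 2023 11:54 am

// Example timezone only: in August, London is one hour ahead of UTC.
$local = $published->setTimezone(new DateTimeZone('Europe/London'));

echo $local->format($dateTimeFormat) . PHP_EOL;     // August 22, 2023 12:54 pm
```

Within WordPress, wp_timezone() returns the site’s configured timezone and could take the place of the hard-coded DateTimeZone here.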

Here’s a final example of the code we used (including the template and a multi-dimensional published array).

<?php
function getFeaturedImage(string $encodedContent): string
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($encodedContent);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);
    $src = $xpath->query('//img/@src');

    return $src[0]?->nodeValue ?? '';
}

$xml = simplexml_load_string(
    file_get_contents(
        get_field('medium_rss_feed') // https://ascend-agency.medium.com/feed
    )
);

$namespaces = $xml->getNamespaces(true);
foreach ($namespaces as $prefix => $ns) {
    $xml->registerXPathNamespace($prefix, $ns);
}

$articles = []; // we'll use this variable to hold the data we want to output

// calling this here so it's not making multiple get_option calls inside our loop
$dateTimeFormat = sprintf(
    '%s %s',
    get_option('date_format'),
    get_option('time_format')
);

foreach ($xml->channel->item as $item) {

    $encodedContent = (string) $item->children($namespaces['content'])->encoded;
    $updated = (string) $item->children($namespaces['atom'])->updated;

    $articles[] = [
        'title' => (string) $item->title,
        'link' => (string) $item->link,
        'featured_image' => getFeaturedImage($encodedContent),
        'published' => [
            'timestamp' => $updated,
            'for_humans' => (new DateTime($updated))->format($dateTimeFormat),
        ],
    ];

}
?>
<div class="grid lg:grid-cols-3">
    <?php foreach ($articles as $article) : ?>
        <article class="relative isolate flex flex-col justify-end overflow-hidden rounded-2xl bg-neutral-900 px-8 pb-8 pt-80 sm:pt-48 lg:pt-80">
            <img
                class="absolute inset-0 -z-10 h-full w-full object-cover"
                src="<?php echo esc_url($article['featured_image'], ['https']); ?>"
                alt="Image for <?php echo esc_attr($article['title']); ?>"
            />
            <div class="absolute inset-0 -z-10 bg-gradient-to-t from-neutral-900 via-neutral-900/40"></div>
            <div class="absolute inset-0 -z-10 rounded-2xl ring-1 ring-inset ring-neutral-300/10"></div>
            <div class="flex flex-wrap items-center gap-y-1 overflow-hidden text-sm leading-6 text-neutral-300">
                <time datetime="<?php echo esc_attr($article['published']['timestamp']); ?>" class="sr-only">
                    <?php echo esc_html($article['published']['for_humans']); ?>
                </time>
            </div>
            <h3 class="mt-4 text-lg font-semibold leading-6 text-white">
                <a href="<?php echo esc_url($article['link'], ['https']); ?>" target="_blank">
                    <span class="absolute inset-0"></span>
                    <?php echo esc_html($article['title']); ?>
                </a>
            </h3>
        </article>
    <?php endforeach; ?>
</div>

And that’s it, you’re all done for article one and your Medium articles should be showing on your WordPress website.

Next up, we’ll look at how this process can be less reliant on Medium in real-time. Let’s say there’s a status issue and Medium’s inaccessible for some reason — you don’t want part of your website to break or appear blank.

In our second article, we’ll move this logic from a just-in-time call to a scheduled event, which will save data to a custom post type and persist key information to your local database. That means you can retrieve your Medium articles using a more WordPress-friendly WP_Query call instead of the bloated approach we’ve had to use above.
