Querying Microformats - further refinements

I added a few refinements to my microformats query code to make it more usable. The first addresses problems with children, html, and value. These aren't real properties, so I added the ability to address them as pseudo-properties. I also added the pseudo-property "1": append it to the end of a path to select the first element. Here's how you'd select the content of an article: "content/html/1".
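To make the path steps concrete, here is a rough sketch of what "content/html/1" selects, written out by hand. The $parsed array is made up for illustration (its shape is assumed to follow the usual microformats2 parser output), and the sketch deliberately doesn't use the query functions themselves.

```php
<?php
// Hypothetical parsed document, shaped like microformats2 parser output.
$parsed = array(
    "items" => array(
        array(
            "type" => array("h-entry"),
            "properties" => array(
                "name" => array("Hello"),
                "content" => array(
                    array(
                        "html" => "<p>Hello <b>world</b></p>",
                        "value" => "Hello world",
                    ),
                ),
            ),
        ),
    ),
);

// Step "content": a real property, found under "properties".
$content = $parsed["items"][0]["properties"]["content"];

// Step "html": a pseudo-property, read directly off each content element.
$html = array_column($content, "html");

// Step "1": keep only the first element of the list.
$first = isset($html[0]) ? $html[0] : "NO DATA";

echo $first, "\n";
```

Without the trailing "1" the result would still be a list, which is what you want when a property can have several values.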

The second change was to decouple the top-level h-* selector from mfpath(). This opens up the ability to traverse microformats in stages, such as when looping through replies to process them separately.

Finally, all strings (except html) are automatically escaped with htmlspecialchars().
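As a small illustration of the difference, here is what escaping does to a plain string compared with an html value that is passed through. Both sample values are invented for the example.

```php
<?php
// Hypothetical property values, as they might appear in a parsed h-entry.
$name = 'Fish & <b>"chips"</b>';
$html = "<p>Fish &amp; <b>chips</b></p>";

// Plain strings are escaped, so any markup in them is shown as text.
$safeName = htmlspecialchars($name);

echo $safeName, "\n";  // Fish &amp; &lt;b&gt;&quot;chips&quot;&lt;/b&gt;

// The html value is left untouched, since it is meant to be rendered.
echo $html, "\n";
```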

function mftype($parsed, $type) {
    // Select top-level items of the given type; array_values() reindexes from 0.
    return array_values(array_filter($parsed["items"], function($elt) use ($type) {
        return in_array($type, $elt["type"]);
    }));
}

function scrubstrings($arr) {
    // Escape every string element; nested arrays are passed through unchanged.
    return array_map(function($elt) {
        if (is_string($elt))
            return htmlspecialchars($elt);
        return $elt;
    }, $arr);
}

function mfprop($mfs, $prop) {
    $props = array();
    // Pseudo-property "1": return the first element instead of a list.
    if ($prop == "1") {
        if (isset($mfs[0])) return $mfs[0];
        return "NO DATA";
    }
    foreach ($mfs as $mf) {
        // Reset per item so a non-matching item contributes nothing.
        $thisprops = array();
        if (isset($mf["properties"][$prop]))
            $thisprops = scrubstrings($mf["properties"][$prop]);
        else if ($prop == "children" && isset($mf[$prop]))
            $thisprops = $mf[$prop];
        else if (($prop == "html") && isset($mf[$prop]))
            $thisprops = array($mf[$prop]);
        else if (($prop == "value") && isset($mf[$prop]))
            $thisprops = scrubstrings(array($mf[$prop]));
        $props = array_merge($props, $thisprops);
    }
    return $props;
}

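function mfpath($mf, $path) {
    // Split the path into steps, dropping empty ones (e.g. from a leading "/").
    $elts = array_filter(explode("/", $path), function($e){return $e!="";});
    // Apply each step to the result of the previous one.
    return array_reduce($elts, function($result, $elt) {
        return mfprop($result, $elt);
    }, $mf);
}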

The posts on this blog are stored as HTML files with microformats2 markup. I use the code below to process them, and the same functions should work for reading replies and reply-contexts from other sites.

function getPost($mf) {
    $e = mftype($mf, "h-entry");
    $post = array(
        "authorName" => mfpath($e, "author/name/1"),
        "authorPhoto" => mfpath($e, "author/photo/1"),
        "authorUrl" => mfpath($e, "author/url/1"),
        "name" => mfpath($e, "name/1"),
        "published" => mfpath($e, "published/1"),
        "contentHtml" => mfpath($e, "content/html/1"),
        "contentValue" => mfpath($e, "content/value/1"),
        "url" => mfpath($e, "url/1"),
    );
    // A post whose name is just its content has no real title, so it is a note.
    $post["type"] = ($post["name"] === $post["contentValue"])
        ? "note" : "article";
    return $post;
}

function getReplies($mf) {
    $replies = array();
    foreach (mfpath(mftype($mf, "h-entry"), "children") as $r) {
        // mfpath() expects a list of microformats, so wrap the single reply.
        $r = array($r);
        $replies[] = array(
            "authorName" => mfpath($r, "author/name/1"),
            "authorPhoto" => mfpath($r, "author/photo/1"),
            "authorUrl" => mfpath($r, "author/url/1"),
            "published" => mfpath($r, "published/1"),
            "contentValue" => mfpath($r, "content/value/1"),
            "url" => mfpath($r, "url/1"),
            "type" => count(mfpath($r, "author")) ? "reply" : "mention"
        );
    }
    return $replies;
}
