Asked by: 小点点

Node.js: How do I create an array of specific objects from data in an HTML string?


I'm a Node.js beginner, and for testing purposes I want to create a simple application that builds an array of objects from a given HTML string.

Let me explain. I have an HTML string that contains multiple div elements like the following:

<div class="user_container">
    <div class="user">
        <div class="thumb">
            <!--            thumbnail block-->
        </div>
        <div class="web_presence_locations"></div>

        <div class="user_data">
            <span class="name">Jaroslaw Chujczynski</span>
            <p class="location_with_flag">
                <!--                img with url here-->
                Leeds,
                United Kingdom
            </p>
            <div class="user_details">
                <div class="amount currency">
                    £28,000.00
                    <span class="overbooked">(in overfunding)</span>
                </div>
            </div>
        </div>
    </div>
    <div class="profile_container">
        <div class="extra_profile_data" style="">
            <div class="investments last">
                <h3 class="h5">Recent Investments</h3>
                <ul>
                    <li class="first">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test1">test1</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test2">test2</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test3">test3</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                    <li class="">
                        <div class="campaign-logo-frame">
                            <a class="campaign_link" href="/test4">test4</a>
                            <span class="currency">£28,000.00</span>
                        </div>
                    </li>
                </ul>
            </div>
        </div>
    </div>
</div>

What I want to do is create an object from the data in the div above. For example, it would look like this:

{
name: 'Jaroslaw Chujczynski',
location: 'Leeds, United Kingdom',
ammountCurrency: '£28,000.00 (in overfunding)',
lastInvestments: [
 {
  name: 'test1',
  currency: '£28,000.00'
 }, {
  name: 'test2',
  currency: '£28,000.00'
 }, {
  name: 'test3',
  currency: '£28,000.00'
 }, {
  name: 'test4',
  currency: '£28,000.00'
 }]
}

Of course, my HTML will contain many divs like this, so I want to create an array of such objects.

OK, here is what I have so far:

const fs = require('fs');
const cheerio = require('cheerio');

const getAllData = (fileName) => {
    try {
        return  fs.readFileSync(fileName, 'utf8');
    } catch(e) {
        console.log('Error:', e.stack);
    }
}
const data = getAllData('test.html');
const $ = cheerio.load(data);

const filterData = () => {
    console.log($('div[class="user_container"]'));
}

filterData();

What it gives me back is the following, which is not what I need (or does it have to be like this?):

 namespace: 'http://www.w3.org/1999/xhtml',
    attribs: [Object: null prototype] {
      class: 'user_container'
    },
    'x-attribsNamespace': [Object: null prototype] {
      class: undefined
    },
    'x-attribsPrefix': [Object: null prototype] {
      class: undefined
    },
    children: [ [Node], [Node], [Node], [Node], [Node], [Node] ],
    parent: Node {
      type: 'tag',
      name: 'section',
      namespace: 'http://www.w3.org/1999/xhtml',
      attribs: [Object: null prototype],
      'x-attribsNamespace': [Object: null prototype],
      'x-attribsPrefix': [Object: null prototype],
      children: [Array],
      parent: [Node],
      prev: [Node],
      next: [Node]
    },
    etc....

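I'm not sure, but I think I first have to get an array of the div blocks whose class is user_container, and once I have it, iterate over that array and create an object for each of them.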

Can anyone help me?


1 answer

Anonymous user

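HTML is closely related to XML, so you should look at XML tools: let one of them parse the HTML, and then you can use that tool to run XML queries against it. This lets you extract the XML you need, which you can then convert to JSON.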

A quick Google search returns the following XML tools for Node.js, but there are many more:

https://www.npmjs.com/package/fast-xml-parser - says it can also export to JSON

http://www.curtismlarson.com/blog/2018/10/03/edit-xml-node-js/ - has a detailed walkthrough.
