Day 9/100 100 Days of Code

Photo by RetroSupply on Unsplash

Info Hunter

Problems, problems, and even more problems

Today I worked on ironing out the problems that had come up. I applied the same solution to another lxb_char_t HTML structure and expected everything to start working properly, but nothing did.

I had a hard time finding where the problem was. For some reason, the program stopped picking out the paragraphs that contain the keywords. After messing around with the debugger a bit, I found out that the collection structure ends up empty.

While digging further, I found that the for loop that processes the website's text is skipped entirely, because its condition depends on the length of that empty collection. This means the text processing never runs.

So, tomorrow's task is to tackle that issue.
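
A loop bounded by a length only gets skipped when that length is 0, so a check right before the loop can make the failure visible instead of silent. Here is a minimal sketch of that idea. It uses a plain std::vector as a stand-in for the lexbor collection, and `reportIfEmpty` and `processAll` are hypothetical names for illustration, not part of lexbor or of Info Hunter.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: warns when a collection is empty, because a loop
// bounded by its length would then run zero times without any error.
// Returns true when the collection is empty.
bool reportIfEmpty(std::size_t length, const std::string& what)
{
    if (length == 0)
    {
        std::cerr << "Warning: no <" << what << "> elements collected; "
                  << "the processing loop will be skipped." << std::endl;
        return true;
    }
    return false;
}

// Stand-in for the real processing loop: counts how many items get visited.
std::size_t processAll(const std::vector<std::string>& paragraphs)
{
    std::size_t visited = 0;
    if (reportIfEmpty(paragraphs.size(), "p"))
    {
        return visited;
    }
    for (std::size_t i = 0; i < paragraphs.size(); i++)
    {
        visited++;
    }
    return visited;
}
```

In the real program the length would come from lxb_dom_collection_length(collection), so the warning would show up exactly when the collection is empty.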

Here's the problematic code:

// Collect every paragraph (<p>) element under the body and keep the returned status
status = lxb_dom_elements_by_tag_name(lxb_dom_interface_element(document->body),
                                     collection, (const lxb_char_t*) "p", 1);

if (status != LXB_STATUS_OK)
{
    exit(EXIT_FAILURE);
}

for (size_t i = 0; i < lxb_dom_collection_length(collection); i++)
    {
        // Get the text from any paragraphs
        // Special thanks to Alexander Lexborisov for providing me with the solution
        // to get the text. Source: https://github.com/lexbor/lexbor/issues/196
        elem = lxb_dom_collection_element(collection, i);
        node = lxb_dom_interface_node(elem)->first_child;

        if (node != nullptr && node->local_name == LXB_TAG__TEXT)
        {
            ch_data = lxb_dom_interface_character_data(node);

            // lxb_char_t is an unsigned char, so the pointer has to be cast
            // before std::string accepts it. Passing the length as well avoids
            // relying on the buffer being null-terminated. Source:
            // https://stackoverflow.com/questions/17746688/convert-unsigned-char-to-stdstring
            std::string toString(reinterpret_cast<const char*>(ch_data->data.data),
                                 ch_data->data.length);

            // The keywords are only read here, so a const reference is enough
            for (const std::string& keyword : grabKeywords)
            {
                if (toString.find(keyword) != std::string::npos)
                {
                    keywordsFound = true;
                    break;
                }
            }

            // Debug output: text length followed by the text itself
            // std::cout << (int) ch_data->data.length << (const char *) ch_data->data.data << std::endl;

            // Print the paragraph if it matched, then reset the flag for the next one
            if (keywordsFound)
            {
                std::cout << "Found: " << toString << std::endl;
                keywordsFound = false;
            }

        }
    }
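
The keyword check in the loop above can also be pulled out and tried on its own, without lexbor. The following is a minimal sketch under that assumption: `toText` and `containsAnyKeyword` are hypothetical helper names, and the byte buffer stands in for the data of a lexbor text node.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Builds a std::string from a byte buffer and an explicit length, so the
// result does not depend on the buffer being null-terminated.
std::string toText(const unsigned char* data, std::size_t length)
{
    if (data == nullptr)
    {
        return std::string();
    }
    return std::string(reinterpret_cast<const char*>(data), length);
}

// Returns true if the text contains at least one of the keywords.
// The match is case-sensitive, like std::string::find, and an empty
// keyword matches any text.
bool containsAnyKeyword(const std::string& text,
                        const std::vector<std::string>& keywords)
{
    return std::any_of(keywords.begin(), keywords.end(),
                       [&text](const std::string& keyword)
                       {
                           return text.find(keyword) != std::string::npos;
                       });
}
```

With helpers like these, the loop body would only need something like `if (containsAnyKeyword(text, grabKeywords))` and the keywordsFound flag would not have to be reset by hand.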