Abstract
Aim
The internet is a key resource for information on medical conditions. Such information should be clear, high-quality, and comprehensive. We aimed to evaluate the quality, reliability, readability, recency, popularity, and comprehensiveness of online pemphigus information and examine how these factors are influenced by the producers of websites.
Materials and Methods
We searched for “pemphigus” on Google, Yahoo, and Bing, including the top 50 results from each. The websites were categorized as websites for professionals, government websites, dermatology societies, non-profit organizations, and miscellaneous websites. We evaluated reliability using the Journal of the American Medical Association (JAMA) Benchmark Criteria and assessed quality using the DISCERN instrument. Readability was measured using the Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook Index, Gunning Fog Index (GFOG), Coleman-Liau Index, and Automated Readability Index. Popularity was based on SimilarWeb visit counts, and content was assessed using a 15-item checklist.
Results
After exclusions, the 35 remaining websites had a mean JAMA score of 3.06 and a mean DISCERN score of 59.31, indicating good quality. The average reading grade was 9.88, suggesting that approximately 10 years of education are required to understand the text. The mean FRES was 46.03, indicating college-level difficulty. The average comprehensiveness score, based on a 15-item checklist, was 11.5. Follow-up visits were the least frequently mentioned topic (8.6%). Statistically significant differences were observed among the website groups in JAMA scores (P = 0.009), FKGL (P = 0.012), GFOG (P = 0.008), popularity (P = 0.002), and information on pemphigus epidemiology (P = 0.021), types (P = 0.014), differential diagnosis (P = 0.008), and prognosis (P = 0.023).
Conclusion
Although the reliability and quality of many websites were satisfactory, our study emphasizes the need for better readability in pemphigus resources. Dermatologists should help create clear and reliable online information to improve patient understanding and health outcomes.
INTRODUCTION
The Internet has become an increasingly vital resource for health information, with a growing number of individuals turning to it for advice on various medical conditions. In the United States, studies have indicated that approximately 70% of adult internet users have sought health-related information online.1 This trend highlights the reliance on search engines as primary tools for accessing health information, often leading patients to a myriad of websites that compete for their attention. However, this competition raises significant concerns regarding the quality and reliability of the information presented.2 Health-related websites provide a diverse array of content, ranging from highly reliable to potentially deceptive. Many of these resources do not undergo peer review and vary considerably in quality.3 Furthermore, the readability of online health information often exceeds the sixth-grade reading level recommended by organizations such as the National Institutes of Health (NIH) and the American Medical Association (AMA). This higher level can hinder comprehension by the average reader, making it essential to ensure that online health materials are evaluated appropriately before use.4
The literature includes studies that assess the quality, readability, and comprehensiveness of online health information for a range of dermatologic conditions, such as hidradenitis suppurativa, acral lentiginous melanoma, rosacea, psoriasis, generalized pustular psoriasis, vitiligo, Behçet’s disease, laser tattoo removal, and oral leukoplakia.5-13 Moreover, a study has examined how large language models can aid in creating patient education materials that are easier to read and understand.14
Pemphigus, a group of life-threatening autoimmune bullous diseases characterized by flaccid blisters and erosions of the mucous membranes and skin,15 is one condition for which patients may seek information online. Previous studies only assessed the readability of online information sources related to pemphigus vulgaris.16 In this study, we evaluated the readability, quality, reliability, recency, and popularity of internet-based information on pemphigus. Additionally, we investigated the comprehensiveness of the content. We also examined the effects of the category of website producers on these factors.
MATERIALS AND METHODS
Internet Search Strategy
An internet search for the term “pemphigus” was conducted using three prominent search engines: Google, Yahoo, and Bing, on June 10, 2024. These search engines were selected based on their status as leading platforms in the UK as of March 2024, with the understanding that patients typically prefer general search tools over specialized medical databases, such as PubMed.17, 18 To ensure the integrity of the search results, geographical location settings were disabled, and browser data were cleared prior to conducting the search. This was performed using the private browsing mode to mitigate potential biases that could arise from the previous search history.
The search results were limited to the first 50 entries from each search engine, resulting in a total of 150 results. The following exclusion criteria were applied: repetitive websites, non-English content, websites lacking relevant information about pemphigus, sites requiring user registration or subscription, research articles, websites related to veterinary medicine, and websites that only provided video content. The process of website evaluation is illustrated in Figure 1.
During the evaluation of the websites, if relevant information could not be located on the homepage, the “three-click rule” was employed. This informal guideline suggests that users should be able to find the desired information in three mouse clicks. It is posited that if information is not accessible within this limit, users are likely to abandon the site.4
Website Typology
We categorized the websites into five distinct categories. Two independent authors classified the data, focusing on the ownership and type of the websites. The categories included: government websites (created and managed by official government agencies); dermatology societies’ official websites (e.g., British Association of Dermatologists, American Academy of Dermatology Association); non-profit organizations’ websites (charitable/supportive/educational websites created by non-profit organizations); miscellaneous websites (including sites that target the general population and do not fit into the other categories); and websites for professionals (containing detailed information primarily aimed at medical professionals). In instances where discrepancies arose between the authors’ classifications, a collaborative re-evaluation was conducted to reach a consensus on the final categorization of each website.
Assessment of Reliability and Quality
To evaluate the reliability of the websites, we used the Journal of the American Medical Association (JAMA) benchmark criteria. The JAMA benchmark criteria encompass four key components: 1) identification of authorship, 2) identification of sources, 3) specification of the date of creation or update, and 4) disclosures regarding ownership, advertising policy, sponsorship, and conflicts of interest. Each criterion was recorded as present or absent, with a scoring system awarding one point for each criterion met. The final score ranged from 0 to 4.2
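The scoring rule described above can be illustrated with a minimal Python sketch. It is not the scoring procedure used in this study, only an illustration of "one point per criterion met"; the criterion labels (`authorship`, `attribution`, `currency`, `disclosure`) are shorthand introduced here, and the example site is hypothetical.

```python
# Shorthand labels (an assumption of this sketch) for the four JAMA benchmark criteria.
JAMA_CRITERIA = ("authorship", "attribution", "currency", "disclosure")


def jama_score(criteria_met):
    """Return a JAMA benchmark score from 0 to 4: one point per criterion present."""
    met = set(criteria_met)
    unknown = met - set(JAMA_CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return len(met)


# Hypothetical website that names its authors, dates its content,
# and discloses ownership, but does not cite its sources.
example_site = {"authorship", "currency", "disclosure"}
print(jama_score(example_site))  # 3
```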
The DISCERN score was used to assess the quality of the selected websites. This tool assesses website quality by grading 16 items on a five-point scale, where 1 indicates “not at all” and 5 indicates “completely.” The overall DISCERN score ranged from 16 to 80, with higher scores indicating higher quality information. Specifically, scores below 27 are categorized as “very poor”, 27-38 as “poor”, 39-50 as “fair”, 51-62 as “good”, and 63 and above as “excellent”.19 The final DISCERN score for each website was obtained by averaging the data from the two authors.
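As a hedged illustration of the DISCERN scoring described above, the following Python sketch sums 16 item ratings, averages two raters, and maps the result to the quality bands quoted in the text. It is not the authors' code. Because an average of two raters can be fractional, the sketch assumes that each band's lower bound acts as the cut-off (for example, 38.5 is treated as "poor"); the source does not state how such values were handled. The rater data are hypothetical.

```python
def discern_total(item_ratings):
    """Sum 16 item ratings (each from 1 to 5) into a total between 16 and 80."""
    if len(item_ratings) != 16:
        raise ValueError("DISCERN has 16 items")
    if any(not 1 <= rating <= 5 for rating in item_ratings):
        raise ValueError("each item is rated from 1 to 5")
    return sum(item_ratings)


def discern_category(score):
    """Map a total score to a quality band, using lower bounds as cut-offs (assumption)."""
    if not 16 <= score <= 80:
        raise ValueError("DISCERN totals range from 16 to 80")
    for lower_bound, label in ((63, "excellent"), (51, "good"), (39, "fair"), (27, "poor")):
        if score >= lower_bound:
            return label
    return "very poor"


# Hypothetical ratings from two raters for one website.
rater_a = [4] * 16  # total 64
rater_b = [3] * 16  # total 48
site_score = (discern_total(rater_a) + discern_total(rater_b)) / 2
print(site_score, discern_category(site_score))  # 56.0 good
```

Applied to the means reported in this study, the same mapping places 59.31 and 52.80 in the "good" band and 66.75 in the "excellent" band.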
Assessment of Readability
Readability assessments of websites were conducted using automated tools available at “https://www.webfx.com/tools/read-able/”. The evaluation employed six established readability scales: Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GFOG), Coleman-Liau Index (CLI), and the Automated Readability Index (ARI). The FRES was measured on a scale from 0 to 100, where a higher score indicates easier readability. Conversely, the FKGL, Gunning Fog Score, SMOG Index, CLI, and ARI provide educational grade levels that reflect the comprehension required for a given text. For optimal readability, the FRES should be ≥ 60, while the other five indices should yield scores of ≤ 6. Therefore, achieving higher FRES scores alongside lower scores in the other formulas indicates improved readability.20, 21
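For readers unfamiliar with these indices, the Python sketch below shows the standard published formulas for the FRES and FKGL together with the target check described above (FRES ≥ 60 and grade-level indices ≤ 6). It is an illustration only: how the WebFX tool tokenizes text and counts syllables is not described here, so the word, sentence, and syllable counts are taken as inputs, and the example counts are hypothetical.

```python
def fres(words, sentences, syllables):
    """Flesch Reading Ease Score: higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def fkgl(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def meets_targets(fres_score, grade_scores):
    """Check the targets stated in the text: FRES >= 60 and every grade index <= 6."""
    return fres_score >= 60 and all(grade <= 6 for grade in grade_scores)


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
print(round(fres(100, 5, 150), 3))  # 59.635
print(round(fkgl(100, 5, 150), 2))  # 9.91
print(meets_targets(fres(100, 5, 150), [fkgl(100, 5, 150)]))  # False
```

Both formulas depend only on average sentence length and average syllables per word, which is why long sentences and polysyllabic medical terms raise the estimated grade level.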
Assessment of Popularity
To evaluate the popularity of the websites included in this study, we used total visit counts over a three-month period obtained from SimilarWeb, a widely recognized web analytics service.22 The data were collected from March 2024 to May 2024. By incorporating these visit counts, we aimed not only to understand the quality of the information provided, but also to determine the extent of its dissemination and accessibility to the general public.
Assessment of Comprehensiveness
To evaluate the comprehensiveness of the websites, we established a checklist comprising 15 parameters. These parameters were in line with current clinical guidelines and relevant literature.15, 23, 24 The checklist includes the following components: definition of pemphigus, epidemiology of the disease, types of pemphigus, pathophysiology, potential trigger factors and causes, symptoms associated with pemphigus, diagnostic evaluation methods, differential diagnoses, general management measures, treatment options, follow-up visit protocols, prognosis of the disease, complications related to pemphigus, references, and photographs as visual aids.
Statistical Analysis
Statistical analyses were conducted using MATLAB R2024a (MathWorks Inc., Natick, MA, USA). Frequency data are presented as number (n) and percentage (%), whereas continuous data are expressed as the mean ± standard deviation. To assess statistical differences between groups, various statistical tests were employed. Chi-square tests were utilized for frequency variables, while the Kruskal-Wallis test was applied to analyze website readability indices, JAMA scores, DISCERN scores, popularity, and recency. For comparisons in which the Kruskal-Wallis test indicated significant differences, the post-hoc Dunn’s test was performed to identify specific group differences. Additionally, Spearman’s rank correlation analysis was performed to evaluate the correlations between JAMA and DISCERN scores, readability indices, recency, and popularity. A P value of <0.05 was considered statistically significant.
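To make the rank-based tests named above concrete, the following Python sketch computes the Kruskal-Wallis H statistic (with the usual tie correction) and Spearman's rank correlation coefficient using only the standard library. It is not the MATLAB code used in this study, the data are hypothetical, and P values are deliberately omitted; in practice, H is compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups.

```python
from itertools import chain
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction (undefined if all values are equal)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    return h / (1 - tie_term / (n ** 3 - n))


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


# Hypothetical scores for three website groups.
print(round(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 2))  # 7.2
print(spearman_rho([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 1.0
```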
RESULTS
Website Typologies
A total of 150 websites were initially assessed, of which 115 were excluded based on the predefined exclusion criteria, leaving 35 websites for evaluation. Among these, 15 websites provided information under the title of “pemphigus”, while 17 specifically focused on “pemphigus vulgaris”, and 3 specifically focused on “pemphigus foliaceus”.
When analyzing the typologies of the 35 evaluated websites, we found that websites targeting professionals comprised 8 websites (23%). In contrast, websites aimed at the general population constituted the majority, accounting for 27 websites (77%). This category of general population websites includes government (n = 5, 14%), dermatology societies’ (n = 5, 14%), non-profit organizations’ (n = 4, 12%), and miscellaneous (n = 13, 37%) websites, as shown in Figure 2.
Comparison of Reliability, Quality, Readability, Popularity, and Recency Among Website Groups
The mean JAMA score for all websites (n = 35) was 3.06±0.97. There was a statistically significant difference in JAMA scores among the different website groups (P = 0.009) (Table 1). The post-hoc Dunn test revealed a significant difference between websites targeting professionals and government websites (P = 0.002).
The mean quality score of all websites, as measured by DISCERN, was 59.31±11.59. The websites targeting professionals had the highest quality score of 66.75±9.97, while government websites had the lowest quality score of 52.80±4.21. However, the analysis revealed no statistically significant difference in quality scores among the different types of websites (P = 0.198) (Table 1).
For readability, the mean FKGL for all websites (n = 35) was 8.64±1.94 years. There was a statistically significant difference in FKGL among the different website groups (P = 0.012). Websites targeting professionals had the highest mean FKGL, indicating more complex content than the other groups. The mean GFOG score for all websites was 10.22±2.45 years. There was a significant difference in GFOG among the website groups (P = 0.008). Post-hoc Dunn’s test showed that websites for professionals scored significantly higher than both government websites (P = 0.001) and miscellaneous websites (P = 0.005). The mean FRES for all websites was 46.03±13.83, indicating college-level difficulty, with no significant differences among the website categories (P = 0.164). Similarly, the mean SMOG Index was 7.46±1.29 years of education, the mean CLI was 15.57±2.49 years of education, and the mean ARI was 7.49±1.51 years of education, with no significant differences among the groups (P = 0.349, P = 0.207, and P = 0.088, respectively) (Table 1).
According to each index, the number of websites at or below a sixth-grade reading level was as follows: 3 for FKGL, 1 for GFOG, 2 for SMOG, 0 for CLI, and 7 for ARI. The average readability grade for all websites was 9.876±3.61, calculated by averaging the FKGL, SMOG, GFOG, CLI, and ARI scores.
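The overall average grade can be checked from the five index means reported above. The short Python sketch below reproduces only the mean; the standard deviation cannot be recovered from the index means alone. The variable names are introduced here for illustration.

```python
from statistics import mean

# Mean grade-level scores reported for all 35 websites.
mean_grades = {"FKGL": 8.64, "SMOG": 7.46, "GFOG": 10.22, "CLI": 15.57, "ARI": 7.49}

average_grade = mean(mean_grades.values())
print(round(average_grade, 3))  # 9.876
```

Averaging the five index means gives the same value as averaging each website's five scores and then averaging across websites, because every website contributes one score to each index.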
In terms of popularity, measured by total visits from March to May 2024, the mean across all websites was 173,640,084 visits. There was a statistically significant difference in popularity across the website categories (P = 0.002) (Table 1). Post-hoc Dunn’s test showed that non-profit organizations’ websites (P = 0.002) and dermatology societies’ websites (P = 0.001) had significantly fewer visits than miscellaneous websites, which had the highest mean number of visits.
The average recency (the time since the last update in months) for all websites was 24.42±21.99 months. However, when analyzed individually, 25 out of 35 websites were produced or updated within the last 2 years. There was no statistically significant difference in average recency among the groups (P = 0.959) (Table 1).
Correlation Analysis
Among the readability formulas tested, only the CLI showed a significant moderate positive correlation with the JAMA index (r = 0.352, P = 0.038). There were no significant correlations between the readability formulas and the DISCERN index. There was a significant strong positive correlation between the JAMA and DISCERN indices (r = 0.5069, P = 0.002). Additionally, there was a significant moderate positive correlation between JAMA scores and popularity (r = 0.384, P = 0.025). Neither index was significantly correlated with recency (Table 2).
Content Analysis
Based on the analysis of the websites using a 15-item checklist, certain statistically significant differences were observed in the presentation of information regarding pemphigus. The definition, pathophysiology, and symptoms of pemphigus were provided on all evaluated websites. Follow-up visits were the least mentioned topic, appearing in 8.6% of the websites. There were statistically significant differences in the inclusion of information on the epidemiology of pemphigus (P = 0.021), types of pemphigus (P = 0.014), differential diagnosis (P = 0.008), and prognosis (P = 0.023) among the groups. The detailed content analysis results are presented in Table 3.
Each checklist item was assigned a score of 1, giving a possible total of 0 to 15. The average total score for all 35 websites was 11.5 out of 15. The group-based average scores were as follows: miscellaneous websites, 11.7; websites for professionals, 13; government websites, 9.2; dermatology societies’ websites, 11; and non-profit organizations’ websites, 11.2.
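The overall average is consistent with the group averages once each group is weighted by its number of websites. The Python sketch below performs this check using the group sizes reported under Website Typologies; it is an arithmetic illustration, not part of the original analysis.

```python
# (number of websites, mean checklist score) for each group, as reported in the text.
groups = {
    "miscellaneous": (13, 11.7),
    "professionals": (8, 13.0),
    "government": (5, 9.2),
    "dermatology societies": (5, 11.0),
    "non-profit organizations": (4, 11.2),
}

total_sites = sum(n for n, _ in groups.values())
weighted_mean = sum(n * score for n, score in groups.values()) / total_sites
print(total_sites, round(weighted_mean, 1))  # 35 11.5
```

A simple unweighted mean of the five group averages would give 11.2, which understates the overall score because the largest group (miscellaneous) scored above the lowest-scoring groups.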
DISCUSSION
Pemphigus, a rare chronic autoimmune blistering disease, severely affects patients’ quality of life, especially in severe forms. Even in the early stages, the disease can significantly disrupt daily activities and overall well-being.25 Consequently, many individuals turn to the internet to seek information about their condition, explore treatment options, and find support. However, patients often lack the ability to assess the quality of online information, making it essential for physicians to guide patients toward trustworthy websites.26, 27 To date, no comprehensive study has evaluated the most prominent websites offering information about pemphigus.
In our review of pemphigus websites, we observed that 77% were aimed at the general public, highlighting a significant effort to spread awareness about this condition. The presence of government websites, dermatology societies, and non-profit organizations underscores the importance of credible sources for educating the public. The largest proportion of websites (37.14%) falls under the category of miscellaneous websites, which raises concerns regarding the accuracy and comprehensiveness of the information presented. This distribution underscores the importance of critically evaluating online resources to ensure that they provide high-quality and reliable information for all users.
Websites targeting professionals received the highest scores for the JAMA criteria, which is not surprising given their adherence to stricter standards. These websites are often authored by experts in the field, ensuring high-quality, evidence-based information. Healthcare providers and those seeking professional-level information should prioritize these resources to ensure their reliability. In line with this, our analysis also revealed that these professional-targeted websites had the highest average DISCERN score, which indicates excellent quality. For the other groups, even the lowest average score fell within the “good quality” range. This suggests that the overall quality of resources across all groups remains relatively consistent, reassuring users that reliable information can be found regardless of category. Similarly, in a study comparing online resources for another autoimmune blistering disease, bullous pemphigoid, categorized by whether they were written by dermatologists or non-dermatologists, there was no significant difference in average DISCERN scores between the two groups.28
In recent years, the readability of online health information has emerged as a critical concern. The AMA and the NIH advocate for health care materials to be composed at or below a sixth-grade reading level to ensure comprehension among a diverse patient population.29, 30 However, our analysis of online resources related to pemphigus revealed that the average reading levels exceed this recommendation, indicating a pervasive issue of accessibility in health communication. Ji-Xu et al.16 reported that online patient education resources for pemphigus vulgaris and bullous pemphigoid are, on average, at least six reading grades above the recommended level, with materials authored by medical doctors, particularly dermatologists, being more complex than those written by non-medical professionals. This lack of readability is particularly detrimental to individuals with low health literacy who are already at an increased risk of misunderstanding their medical conditions and treatment options.31 Such misunderstandings can lead to delayed medical care, poor health outcomes, and decreased adherence to prescribed therapies. In addition, misinterpretation of medical information can exacerbate patient anxiety and stress. Our findings also showed that, according to the FKGL and GFOG, websites for professionals were significantly more difficult to read than other websites. This greater reading difficulty is justifiable for several reasons. Professional content is tailored to individuals with specialized knowledge, requiring the use of advanced terminology and detailed information to meet the sophisticated needs of this audience. Furthermore, professionals generally possess higher education and experience, enabling them to grasp more complex material.
Skrzypczak et al.’s5 study on the readability of online documents about hidradenitis suppurativa evaluated 458 articles across 22 languages, categorized by origin as non-profit, online shops, dermatology clinics, or pharmaceutical companies. The Lix score was used to assess readability, with most articles classified as very difficult to understand. Significant differences in readability were found across languages, but no notable differences were observed among the different origin categories.5
Jean-Pierre et al.12 assessed the readability and comprehensiveness of 77 websites on laser tattoo removal. They found that most sites were above the eighth-grade reading level, and less than half addressed pigmentary risks for darker skin or the need for consulting a board-certified dermatologist or plastic surgeon. More than 90% of the websites mentioned the need for multiple sessions. This study highlighted a gap in accessible, high-quality information for informed decision-making regarding laser tattoo removal.12
Malik et al.9 first evaluated the quality, comprehensiveness, and readability of online health information on generalized pustular psoriasis. An analysis of 500 websites with medical and layperson search terms revealed that only 16.8% were HONcode-accredited, and the mean DISCERN scores indicated notable gaps in reliability and treatment information. Additionally, only 4% of websites met the NIH-recommended sixth-grade reading level, with academic sites being harder to read than government sites, highlighting challenges for patients with low health literacy, who may already be at higher risk of not receiving timely medical care.9
Nayudu et al.10 assessed the quality and readability of online health information on phototherapy for vitiligo. An analysis of 500 websites with medical search terms revealed that 35% were HONcode-accredited, indicating reliability. The DISCERN scores highlighted gaps in reliability (58.9%) and treatment information (51.7%). Notably, none of the 130 websites assessed met the NIH-recommended sixth-grade reading level, indicating potential health disparities among patients with lower health literacy.10
Given that dermatologic patient education materials are often written above the recommended reading level, Lambert et al.14 evaluated the use of large language models (ChatGPT-3.5, GPT-4, DermGPT, and DocsGPT) to generate patient education materials at specific, accessible reading levels. The FKGL of existing American Academy of Dermatology materials for common and rare conditions was assessed. The models were prompted to create handouts at fifth- and seventh-grade FKGLs, with GPT-4 performing best at the fifth-grade level for both common and rare conditions, while ChatGPT-3.5 and DocsGPT outperformed GPT-4 at the seventh-grade level for rare conditions. They concluded that large language models could enhance health literacy by providing accessible and understandable patient education materials in dermatology.14
In our study, websites in the miscellaneous category were significantly more popular than those of both non-profit organizations and dermatology societies. It is important to note that our analysis focused on general domains (e.g., https://patient.info) rather than specific pages (e.g., https://patient.info/doctor/pemphigus). This approach may not accurately reflect interest in specific topics but provides a broader perspective on overall website popularity.
Recent advancements in the treatment of pemphigus, particularly anti-CD20 therapy, have significantly improved treatment efficacy and reduced morbidity.24 Given that pemphigus is a disease group that requires ongoing research, maintaining updated information is crucial for effective management.32 Regarding content recency, a substantial proportion of the websites on pemphigus were updated within a reasonable timeframe, typically within the last two years. Furthermore, we observed no significant differences in the recency across the various website categories.
The CLI was positively correlated with JAMA, whereas other readability formulas did not show a significant relationship with JAMA. This may be due to these formulas evaluating different dimensions of text complexity. No significant correlations were observed between the readability formulas and the DISCERN index. Similarly, an analysis of online patient materials on dysplastic nevi found no correlation between DISCERN scores and readability metrics such as the Flesch Reading Ease and FKGL.33 This lack of correlation can be attributed to the inherent complexity of medical terminology and the limitations associated with these readability formulas.
The strong positive correlation between the JAMA and DISCERN indices suggests that both measures evaluate overlapping quality and reliability criteria. This finding aligns with previous research, such as a study on the quality of information on septic arthritis, which found a strong positive correlation between DISCERN scores and JAMA scores (r = 0.877, P < 0.05).34 Additionally, the positive correlation between JAMA and popularity implies a tendency for reliable information sources to attract more attention, which is beneficial for public health.
The analysis revealed that although fundamental information regarding the definition, pathophysiology, and symptoms of pemphigus was universally covered, crucial aspects such as follow-up visits were notably underrepresented, appearing in only 8.6% of the websites. Proper follow-up is essential for managing chronic conditions like pemphigus, as it allows for the monitoring of disease progression, assessment of treatment efficacy, and timely management of any complications.15, 23 Significant differences were noted in the availability of information on the epidemiology, types of pemphigus, differential diagnosis, and prognosis across the various website categories. These discrepancies indicate differences in the focus of the content, which are likely influenced by the intended audience and the expertise of the authors.
Study Limitations
This study has several limitations that should be acknowledged. First, there is no consensus on which readability index yields the most accurate results; therefore, we utilized various indices commonly referenced in the literature. Additionally, although the DISCERN instrument is well-established and evidence-based, it inherently involves a degree of subjectivity. Our search was restricted to English language materials, which limited our insight into the quality of patient education resources available in other languages. Furthermore, the content evaluation was based solely on whether specific topics were mentioned, without considering the depth or detail of the information provided. Lastly, given that the Internet is a rapidly changing medium, our current analysis represents a timely but limited snapshot of the available patient education materials, which may evolve significantly over time.
CONCLUSION
Our study highlights the critical need for improved readability of online resources related to pemphigus. Although the reliability and quality of the content on these websites were found to be satisfactory, the readability levels significantly exceeded the NIH’s grade six recommendation, potentially hindering patient comprehension. It is important for dermatologists to actively engage in the evaluation and endorsement of online information, ensuring that patients are directed toward reliable and comprehensible resources. By prioritizing readability in the development of online content, dermatologists can enhance patient understanding and improve health outcomes.