How HTML entity encoding works
An HTML entity is a character reference that the browser parses back into a single character. The five reserved HTML characters (< > & " ') always need encoding when text is rendered as HTML; everything else is optional and depends on the document encoding.
- Pick a mode and scope. Encode mode walks your input character by character. Decode mode scans the input for entity patterns. The scope toggle decides whether only the five HTML-safe characters are encoded, or whether every non-ASCII code point is rewritten as well.
- Pick an entity style. Named entities (`&copy;`) read well in source. Decimal (`&#169;`) and hex (`&#xA9;`) references carry any Unicode code point without needing a name. Older email clients and XML parsers prefer the numeric forms.
- Walk the input. On encode, we read each code point and check it against an internal table of roughly 200 common named entities. Code points not found there fall back to numeric. On decode, we scan with a single regex that matches `&name;`, `&#NNN;`, and `&#xHH;` in one pass.
- Map to characters. Named matches resolve through the reverse table. Numeric matches go through `String.fromCodePoint` with base 10 or base 16. Unknown named entities are left untouched so the input round-trips without loss.
- Live mode. Turn on live mode and every keystroke re-runs the conversion with a 150 ms debounce. This is useful when you are editing a snippet and want immediate feedback before pasting it into a template.
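The encode steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function name `encodeEntities`, the option names `style` and `scope`, and the five-entry `NAMED` table are all assumptions made for the example (the real table is described as holding about 200 names).

```javascript
// Hypothetical mini-table standing in for the ~200-entry internal table.
const NAMED = new Map([
  [0x26, "amp"], [0x3c, "lt"], [0x3e, "gt"], [0x22, "quot"], [0xa9, "copy"],
]);
// The five reserved characters: & < > " '
const RESERVED = new Set([0x26, 0x3c, 0x3e, 0x22, 0x27]);

// style: "named" | "decimal" | "hex"; scope: "minimal" | "all" (assumed names)
function encodeEntities(input, { style = "named", scope = "minimal" } = {}) {
  let out = "";
  // for...of iterates by code point, so astral characters stay whole.
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    const mustEncode = RESERVED.has(cp) || (scope === "all" && cp > 0x7f);
    if (!mustEncode) {
      out += ch;
    } else if (style === "named" && NAMED.has(cp)) {
      out += `&${NAMED.get(cp)};`;
    } else if (style === "hex") {
      out += `&#x${cp.toString(16).toUpperCase()};`;
    } else {
      // Decimal style, and the fallback when no name is in the table.
      out += `&#${cp};`;
    }
  }
  return out;
}
```

With this sketch, `encodeEntities("<b>")` yields `&lt;b&gt;`, and `encodeEntities("©", { scope: "all", style: "hex" })` yields `&#xA9;`. Iterating by code point rather than by UTF-16 unit is what keeps emoji and other astral characters from being split into two surrogate references.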
Why encode HTML entities
- Keep user input from breaking the layout. When a user types a stray `<` into a comment box, inserting that text directly into HTML rewrites the rest of the page. Encoding the reserved characters first means the browser displays the character instead of parsing it as the start of a tag.
- Keep attribute values valid. Inserting a quoted string into an HTML attribute requires replacing the embedded quote with `&quot;` (for double-quoted attributes) or `&#39;` (for single-quoted ones). Otherwise the browser closes the attribute early and the rest of the line becomes stray markup.
- Neutralize accidental HTML in stored data. Logs, error reports, and debug dumps often contain literal angle brackets and ampersands. Entity-encoding the stored text before pasting it into a documentation page keeps that copy visible as text instead of triggering the renderer or an auto-linker.
- Share code snippets safely. Putting an example tag like `<script>alert(1)</script>` into a blog post, an email, or a Slack message requires encoding the brackets so the snippet is displayed instead of executed. The same technique applies to RSS feed bodies and JSON-LD `description` fields.
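The attribute case is the one that most often goes wrong by hand, so here is a small sketch. The helper name `escapeAttr` is hypothetical; the only real API used is `String.prototype.replace`. The ampersand is replaced first so that the entities produced by the later replacements are not encoded a second time.

```javascript
// Hypothetical helper: escape a value for use inside a quoted HTML attribute.
function escapeAttr(value) {
  return value
    .replace(/&/g, "&amp;")   // must run first to avoid double-encoding
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const title = 'She said "hi" & left';
const html = `<span title="${escapeAttr(title)}">note</span>`;
// html is: <span title="She said &quot;hi&quot; &amp; left">note</span>
```

Escaping both quote styles means the helper stays correct whichever quote character the surrounding template uses for the attribute.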
Common use cases
Entity encoding shows up anywhere raw text is interpolated into HTML at runtime. Even when a framework usually handles it for you, a manual tool is useful for the times it does not.
- Server-rendered templates: Jinja2 (under Flask), ERB (under Rails), Twig, and Handlebars auto-escape by default, but raw blocks and `safe` markers turn that off; the codec lets you confirm what the escape would have produced.
- Email and newsletter authoring: most ESP template engines do not auto-escape merge fields, so smart quotes and copyright symbols in user-supplied names need pre-encoding.
- Documentation and code samples: pasting an example HTML tag into a Markdown blog post or a static-site snippet requires encoding the brackets so the renderer treats it as visible text.
Worked example
Paste `<script>alert('hi')</script>` into the input with the mode set to Encode, the style set to Named, and the scope set to Minimal. The output reads `&lt;script&gt;alert(&#39;hi&#39;)&lt;/script&gt;`. Switch the style to Numeric hex and the same input produces `&#x3C;script&#x3E;alert(&#x27;hi&#x27;)&#x3C;/script&#x3E;`. Flip the mode to Decode, paste the encoded string back in, and the original tag returns intact.
FAQ
What are HTML entities?
HTML entities are character references that the browser replaces with single characters when it parses the page. They come in three forms: named (such as `&amp;` for &), decimal numeric (`&#38;`), and hex numeric (`&#x26;`). The five reserved HTML characters (<, >, &, ", ') need encoding wherever text is dropped into HTML. The remaining roughly 2,225 named entities cover symbols, accents, and Greek letters, but they are optional once the document encoding is UTF-8.
When should you use named vs numeric entities?
Use named entities when you want the source to read cleanly (a human looking at `&copy;` in a template recognizes it immediately). Use numeric references (decimal or hex) when the consumer is older or stricter: XML parsers, older email clients, and some feed readers recognize only a small subset of the HTML5 named entities (plain XML predefines just `&amp;`, `&lt;`, `&gt;`, `&quot;`, and `&apos;`), while all of them recognize the numeric forms. Hex often wins in security-focused contexts because it lines up with the Unicode code-point notation used in standards documents.
Does decode handle hex entities like `&#x26;`?
Yes. The decoder uses a single regex that matches all three entity forms in one pass: `&name;`, `&#NNN;`, and `&#xHH;`. Numeric matches are resolved with `String.fromCodePoint` using base 10 or base 16. Mixed input (named and numeric in the same string) decodes correctly, and unknown names are left as literal text so the input round-trips without loss.
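A single-pass decoder along those lines might look like the sketch below. The regex, the `decodeEntities` name, and the six-entry `REVERSE` table are illustrative assumptions, not the tool's real source. The range check is an added safeguard, because `String.fromCodePoint` throws a `RangeError` for values above U+10FFFF.

```javascript
// Hypothetical reverse table; the real one would mirror the encoder's table.
const REVERSE = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"], ["copy", "©"],
]);

// One alternation covers &#xHH; (group 1), &#NNN; (group 2), and &name; (group 3).
const ENTITY = /&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([A-Za-z][A-Za-z0-9]*));/g;

function decodeEntities(input) {
  return input.replace(ENTITY, (match, hex, dec, name) => {
    if (name !== undefined) {
      // Unknown names are returned unchanged so the text round-trips.
      return REVERSE.get(name) ?? match;
    }
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Out-of-range references are left as literal text instead of throwing.
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });
}
```

Because `replace` does not rescan its own output, `decodeEntities("&amp;lt;")` returns `&lt;` and not `<`, which is the behavior you want from a decoder that undoes exactly one layer of encoding.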
Is this safe to use with untrusted input?
The codec itself runs only in the browser and does not send your input anywhere. Whether the output is safe to insert depends on the context. Entity encoding handles the HTML body and attribute-value contexts, which is what OWASP's Rule #1 covers. JavaScript contexts (inline event handlers, `<script>` blocks), CSS contexts, and URL contexts each need their own encoding rules; entity encoding alone is not enough there. For defense in depth, pair this with an HTML sanitizer such as DOMPurify or with your framework's context-aware auto-escaping.
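To make the context point concrete, the sketch below encodes the same untrusted string for two non-HTML contexts using only built-in JavaScript functions. The variable names are illustrative, and this covers only a URL query value and a string literal inside a `<script>` block; it is not a complete XSS defense.

```javascript
const untrusted = `"><script>alert(1)</script>`;

// URL context: percent-encode the value before placing it in a query string.
const urlSafe = encodeURIComponent(untrusted);
// urlSafe is: %22%3E%3Cscript%3Ealert(1)%3C%2Fscript%3E

// JavaScript string context inside a <script> block: JSON.stringify adds the
// quotes and escapes them; replacing "<" with its \u escape keeps a literal
// "</script>" from terminating the block early.
const jsSafe = JSON.stringify(untrusted).replace(/</g, "\\u003c");
```

Neither result contains an HTML entity, which is the point: each context has its own escape syntax, and entity encoding would be the wrong tool in both.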
Browser-side entity encoding sits at the boundary between user input and rendered HTML. Doing the conversion locally means you can check what your framework would produce without ever sending the raw text to a third-party tool.