Swift

Unicode and ASCII

기가정훈 2022. 7. 25. 20:00

Hello! Today I'm going to write up Unicode, which I put off covering in my Strings and Characters post.

I'll do my best to cover ASCII as well!! 🤜

 

📚 References and links

 

Strings and Characters — The Swift Programming Language (Swift 5.7)


docs.swift.org

 


 

Before getting into Unicode and ASCII, let's look at binary.

Binary represents data using only 0 and 1.

The basic mechanism a computer uses to represent information is the electrical signal. A computer contains an enormous number of switches (transistors).

Each transistor represents one of two cases: '1' if an electrical signal is present and '0' if it is not.

The unit that represents a single binary digit is called a bit.

A single bit cannot represent much data, so computers group bits into bit strings to represent combinations of values.

A string of 8 bits is called a byte.

Each bit is either 0 or 1, so there are 2⁸ = 256 distinct byte values.
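The 2⁸ = 256 figure can be checked directly in Swift. This is a minimal sketch; the helper name `binaryString` is made up for this example and is not part of the standard library.

```swift
// A byte is 8 bits, so it has 2^8 distinct bit patterns.
let bitsPerByte = UInt8.bitWidth        // 8
let patternCount = 1 << bitsPerByte     // 256

// Hypothetical helper: render a byte as 8 binary digits with leading zeros.
func binaryString(_ value: UInt8) -> String {
    let digits = String(value, radix: 2)
    return String(repeating: "0", count: 8 - digits.count) + digits
}

print(patternCount)                // 256
print(binaryString(UInt8.min))     // 00000000
print(binaryString(UInt8.max))     // 11111111
```

`UInt8` is Swift's unsigned 8-bit integer, so its range of 0 through 255 covers exactly those 256 patterns.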

 

 

There are agreed-upon conventions (standards) for representing characters as numbers!!!

One of them is ASCII, the American Standard Code for Information Interchange.

 

ASCII

ASCII defines 128 codes in total. The letter A, for example, is 65 in decimal.

Since A is 65 in decimal, it can be written as 2⁶ × 1 + 2⁵ × 0 + 2⁴ × 0 + 2³ × 0 + 2² × 0 + 2¹ × 0 + 2⁰ × 1 = 64 + 1 = 65. So A in binary is 1000001.
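The same conversion can be reproduced with the Swift standard library. This is a small sketch using `Character.asciiValue` and the radix initializers; the variable names are chosen just for this example.

```swift
let letter: Character = "A"

// asciiValue is a UInt8? and is nil for characters outside ASCII.
if let code = letter.asciiValue {
    print(code)                      // 65
    print(String(code, radix: 2))    // 1000001
}

// Going the other way: parse the binary digits and turn them back into a character.
let decoded = UInt8("1000001", radix: 2).map { Character(Unicode.Scalar($0)) }
print(decoded == Character("A"))     // true
```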

However, ASCII text is normally stored in 8-bit bytes, yet the value we just computed (like the entries in the ASCII table) has only 7 digits. 🤔

ASCII itself is a 7-bit code. The remaining bit was traditionally used as a parity bit, which helps detect errors in the data.


Parity bit: a bit added to check whether an error occurred while information was being transmitted. One extra bit is appended to the end of the data before it is sent.

 

Parity comes in two kinds: odd parity and even parity.

With even parity, the parity bit is 0 if the data contains an even number of 1s and 1 if it contains an odd number, so the total number of 1s is always even. Odd parity is the reverse.

The drawback of a parity bit is that it can only detect errors; it cannot correct them.
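The even-parity rule can be sketched in a few lines of Swift. The function names and the framing (7 data bits in the low bits, parity in the top bit) are assumptions made for this illustration, not a description of any particular protocol.

```swift
// Even parity: 0 when the data has an even number of 1s, 1 when it has an odd number.
func parityBit(for data: UInt8) -> UInt8 {
    return UInt8(data.nonzeroBitCount % 2)
}

// Hypothetical framing: keep 7 data bits and store the parity bit in the top bit.
func addParity(to sevenBits: UInt8) -> UInt8 {
    let data = sevenBits & 0b0111_1111
    return data | (parityBit(for: data) << 7)
}

// A received frame is consistent when its total number of 1s is even.
func looksValid(_ frame: UInt8) -> Bool {
    return frame.nonzeroBitCount % 2 == 0
}

let frame = addParity(to: 65)              // "A" = 1000001 has two 1s, so parity is 0
let oneBitFlipped = frame ^ 0b0000_0100    // simulate a single-bit error
let twoBitsFlipped = frame ^ 0b0000_0110   // simulate a double-bit error

print(looksValid(frame))            // true
print(looksValid(oneBitFlipped))    // false: detected, but not which bit is wrong
print(looksValid(twoBitsFlipped))   // true: two flips cancel out and go unnoticed
```

The last line shows a further limit of a single parity bit: it only catches an odd number of flipped bits.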

To find where the error is, you need something like a Hamming code.

I'll write about Hamming codes later, in the CS notes category!!! 📚

 

Problems with ASCII

The ASCII table covers only English letters, along with digits, punctuation, and control codes. When ASCII was created, computers were used mainly by English speakers. As computers spread to many other countries, a variety of standards appeared so that more characters could be represented.


▶ A new standard was needed to unify all of these, and that is how Unicode came about.

 

Unicode

Unicode appeared as an international standard code for defining the characters of the world's languages. One byte was not enough, so it was initially extended to 2 bytes, giving 2¹⁶ = 65,536 values; the code space has since grown to cover U+0000 through U+10FFFF. Unicode brings its own complication: the number of bytes varies from character to character. In UTF-8, for example, an English letter takes 1 byte, a Hangul syllable takes 3 bytes, and many emoji take 4 bytes. Because of this variable-length representation, the computer has to be told which encoding to use when it reads the bytes.
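To make the variable-length point concrete, this sketch counts how many code units a few characters need in UTF-8 (1 byte per unit) and UTF-16 (2 bytes per unit). The sample characters are my own choice for illustration.

```swift
// "A", the Hangul syllable 한 (U+D55C), ♥ (U+2665), and 💖 (U+1F496)
let samples = ["A", "\u{D55C}", "\u{2665}", "\u{1F496}"]

for text in samples {
    // character, UTF-8 code units, UTF-16 code units, Unicode scalars
    print(text, text.utf8.count, text.utf16.count, text.unicodeScalars.count)
}
// A 1 1 1
// 한 3 1 1
// ♥ 3 1 1
// 💖 4 2 1
```

So the Hangul syllable is 3 bytes in UTF-8 but a single 2-byte code unit in UTF-16, while the emoji needs 4 bytes in both.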


I plan to cover Unicode encoding schemes and similar details in a separate post!!

 

Now let's get properly started on how Unicode is used in strings and characters.

 

Special characters in string literals

String literals can include special characters.

let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
// "Imagination is more important than knowledge" - Einstein
let dollarSign = "\u{24}"           // $, Unicode scalar U+0024
let blackHeart = "\u{2665}"         // ♥, Unicode scalar U+2665
let sparklingHeart = "\u{1F496}"    // 💖, Unicode scalar U+1F496

\u{n}: an arbitrary Unicode scalar value, where n is a 1-8 digit hexadecimal number.

 

Unicode scalars

Swift's native String type is built from Unicode scalar values. Scalars give you a way to access the individual pieces of a String, whose characters vary in size. A Unicode scalar value is a unique 21-bit number, and one or more Unicode scalars combine to form a Character.

 

e.g. U+0061 represents LATIN SMALL LETTER A ("a"), and U+1F425 represents FRONT-FACING BABY CHICK ("🐥").
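Not every 21-bit number is a valid scalar. This sketch uses the failable `Unicode.Scalar` initializer that takes a `UInt32` to show which values are accepted; the variable names are only for illustration.

```swift
let smallA = Unicode.Scalar(0x61 as UInt32)          // LATIN SMALL LETTER A
let chick = Unicode.Scalar(0x1F425 as UInt32)        // FRONT-FACING BABY CHICK
let surrogate = Unicode.Scalar(0xD800 as UInt32)     // nil: surrogate code points are not scalars
let tooLarge = Unicode.Scalar(0x110000 as UInt32)    // nil: beyond U+10FFFF

if let scalar = chick {
    // Print the character and its code point in hexadecimal.
    print(Character(scalar), String(scalar.value, radix: 16, uppercase: true))   // 🐥 1F425
}
print(surrogate == nil, tooLarge == nil)             // true true
```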

 

Unicode scalars can be combined to form a single character.

let precomposed: Character = "\u{D55C}"                  // 한
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}"   // ᄒ, ᅡ, ᆫ
// precomposed is 한, decomposed is 한
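A quick check makes the difference between the two forms visible. In this sketch, `syllable` and `jamoSequence` are names picked for the example; both hold a single `Character`, but they are built from a different number of scalars.

```swift
let syllable: Character = "\u{D55C}"                      // one precomposed scalar
let jamoSequence: Character = "\u{1112}\u{1161}\u{11AB}"  // three conjoining jamo

print(syllable == jamoSequence)               // true
print(syllable.unicodeScalars.count)          // 1
print(jamoSequence.unicodeScalars.count)      // 3
print(String(jamoSequence).count)             // 1 (still a single Character)
```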

 

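Comparing Unicode strings and characters

Swift compares strings and characters by their composed form: two values are equal if their extended grapheme clusters are canonically equivalent, even when they are built from different Unicode scalars.

// "Voulez-vous un café?" using LATIN SMALL LETTER E WITH ACUTE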
let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"

// "Voulez-vous un café?" using LATIN SMALL LETTER E and COMBINING ACUTE ACCENT
let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"

if eAcuteQuestion == combinedEAcuteQuestion {
    print("These two strings are considered equal")
}
// Prints "These two strings are considered equal"

Conversely, characters that look the same are treated as different characters if their Unicode scalars differ.

let latinCapitalLetterA: Character = "\u{41}"

let cyrillicCapitalLetterA: Character = "\u{0410}"

if latinCapitalLetterA != cyrillicCapitalLetterA {
    print("These two characters are not equivalent.")
}
// Prints "These two characters are not equivalent."

※ String and character comparisons in Swift are not locale-sensitive. The same character is treated as the same character regardless of language.
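To see that equality works on the composed form rather than on the raw scalars, this sketch compares two spellings of "café" at several levels. The constant names are made up for the example.

```swift
let single = "caf\u{E9}"             // é as one scalar
let combined = "caf\u{65}\u{301}"    // e followed by COMBINING ACUTE ACCENT

print(single == combined)                                            // true
print(single.count, combined.count)                                  // 4 4
print(single.unicodeScalars.count, combined.unicodeScalars.count)    // 4 5
print(single.unicodeScalars.elementsEqual(combined.unicodeScalars))  // false

// Look-alike letters from different scripts stay distinct.
let latinA: Character = "\u{41}"
let cyrillicA: Character = "\u{0410}"
print(latinA == cyrillicA)                                           // false
```

If you need to compare the underlying scalars or bytes exactly, comparing the `unicodeScalars` or `utf8` views as above is one option.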

 

Unicode representations of strings

When a Unicode string is written to a text file or other storage, its Unicode scalars are encoded using one of several Unicode encoding forms, such as UTF-8, UTF-16, or UTF-32.

let dogString = "Dog‼🐶"

 

UTF-8 representation

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// 68 111 103 226 128 188 240 159 144 182

 

UTF-16 representation

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// 68 111 103 8252 55357 56374

 

Unicode scalar representation

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// 68 111 103 8252 128054



for scalar in dogString.unicodeScalars {
    print("\(scalar) ")
}
// D
// o
// g
// ‼
// 🐢

 


 

Writing up Unicode and ASCII gave me a chance to sort out the parts I hadn't known.

I also plan to group other CS topics into a separate category and write them up one at a time.

 

↓ These are the blogs I consulted for things I didn't know while writing this!!

https://velog.io/@kim-jaemin420/ASCII-Unicode-%EC%95%84%EC%8A%A4%ED%82%A4%EC%BD%94%EB%93%9C%EC%99%80-%EC%9C%A0%EB%8B%88%EC%BD%94%EB%93%9C

https://m.blog.naver.com/PostView.naver?isHttpsRedirect=true&blogId=ansdbtls4067&logNo=220886661657

https://devuna.tistory.com/31

https://velog.io/@haze5959/Swift-Unicode-Scalar-%EA%B7%B8%EB%A6%AC%EA%B3%A0-%EB%AC%B8%EC%9E%90%EC%97%B4-count-%EC%8B%9C%EA%B0%84-%EB%B3%B5%EC%9E%A1%EB%8F%84-%EA%B4%80%EA%B3%84