고속 역 제곱근 문서 원본 보기

{{위키데이터 속성 추적}}
[[파일:OpenArena-Rocket.jpg|오른쪽|섬네일|300x300px|위 사진에 나타난 [[1인칭 슈팅 게임]] ''OpenArena''의 [[조명]] 처리와 같은 연산에 [[입사각]]과 [[반사각]]을 계산할 때 고속 역 제곱근 알고리즘을 사용한다.]]

'''고속 역 제곱근'''(高速逆-根, fast inverse square root)은 때때로 '''Fast InvSqrt()'''나 [[16진수]] '''0x5f3759df'''라고도 하는, [[IEEE 754]] [[부동 소수점|부동소수점]] 체계의 [[32비트]] 실수에 대한 제곱근의 [[역수]]를 계산하기 위한 알고리즘이다. 이 알고리즘은 1990년대 초에 [[실리콘 그래픽스]]에서 개발한 것으로 추정되며, 1999년에 ''[[퀘이크 3 아레나]]''의 소스 코드에 쓰인 것이 가장 유명하다.    이 알고리즘에 관한 논의가 2000년에 중국어 개발자 포럼 CSDN에서 있었고,<ref name="csdn">{{웹 인용|url=http://bbs.csdn.net/topics/41888 |title=Discussion on CSDN |archive-url=https://web.archive.org/web/20150702180504/http://bbs.csdn.net/topics/41888 |archive-date=2015-07-02 |url-status=dead |df= }}</ref> 대중에 알려진 것은 2002년 또는 2003년에 [[유즈넷]]과 같은 공개 포럼에서였다.<ref name="Beyond3D">{{웹 인용|url=http://www.beyond3d.com/content/articles/8/|title=Origin of Quake3's Fast InvSqrt()|last=Sommefeldt|first=Rys|date=2006-11-29|work=Beyond3D|accessdate=2009-02-12}}</ref> 이 알고리즘은 [[컴퓨터 그래픽스]]에서 [[조명]]과 [[셰이딩]]을 위한 [[입사각]]과 [[반사각]] 계산에 사용되면서 대규모 부동소수점 연산의 [[알고리즘 분석|계산 비용]] 문제 해소에 도움이 되었다.

이 알고리즘은 실수를 입력 받아 나중에 사용할 절반의 값을 따로 저장한 다음, 입력 받은 float 실수를 long 비트로 취급한다. 한 비트를 오른쪽 [[논리 시프트]]한 결과가 [[매직 넘버]] [[십육진법|0x]]5f3759df에서 감산되며, 더 정확한 근사를 위해 이 근사치를 다시 float 실수로 취급하여 [[뉴턴의 방법]]을 한 번 사용한다. 이 알고리즘은 float 실수의 나눗셈을 사용하는 것보다 거의 네 배 더 빠르게 역 제곱근을 계산한다.

[[존 카맥]]은 ''퀘이크 3 아레나''에 사용된 소스 코드에 대해 [[이드 소프트웨어]]에서 ''[[퀘이크 (비디오 게임)|퀘이크]]''의 최적화를 도왔던 탁월한 어셈블리 프로그래머인 Terje Mathisen이 작성했을 것으로 보았다. 그러나 이러한 방법이 하드웨어 개발과 소프트웨어 개발 모두에서 뿌리 깊게 사용되고 있었던 것으로 나타났고, 알려진 최초의 사용으로는 [[게리 타롤리]]의 SGI Indigo를 위한 구현이 실리콘 그래픽스와 [[3dfx|3dfx 인터랙티브]]를 거쳤다. Rys Sommefeldt는 Ardent Computer의 Greg Walsh가 [[MATLAB]]을 설계한 [[클리브 몰러]]와의 협의를 거쳐 이 알고리즘을 만들었다고 결론지었다.<ref name="Beyond3Dp2">{{웹 인용|url=http://www.beyond3d.com/content/articles/15/|title=Origin of Quake3's Fast InvSqrt() - Part Two|accessdate=2008-04-19|author=Sommefeldt, Rys|date=2006-12-19|publisher=Beyond3D}}</ref>

== 개요 ==
[[파일:Surface_normal.png|오른쪽|섬네일|위 사진에 나타난 곡면 상의 표면에 수직인 벡터, 즉 [[평면|법선 벡터]]는 조명과 셰이딩을 위한 계산에 광범위하게 사용된다.]]

역 제곱근은 [[단위 벡터]]를 구하는 데 사용되며, [[평면|법선]] 단위 벡터를 사용하여 표면을 향하는 빛의 입사각과 반사각을 결정할 수 있다. 3차원 벡터 <math>\boldsymbol{v}(v_1,\ v_2,\ v_3)</math>의 크기를 [[유클리드 노름]]
: <math>\|\boldsymbol{v}\|=\sqrt{v_1^2+v_2^2+v_3^2}</math>
으로 정하면 이 벡터와 같은 방향의 단위 벡터는 다음과 같다:
: <math>\frac{\boldsymbol{v}}{\|\boldsymbol{v}\|}\left(\frac{v_1}{\sqrt{v_1^2+v_2^2+v_3^2}},\ \frac{v_2}{\sqrt{v_1^2+v_2^2+v_3^2}},\ \frac{v_3}{\sqrt{v_1^2+v_2^2+v_3^2}}\right)</math>
여기서 실수 <math>x</math>에 대한 역 제곱근 <math>\frac{1}{\sqrt{x}}</math>이 등장한다. [[3차원 컴퓨터 그래픽스]]에서는 이러한 연산을 매초 수백만 번 수행해야 했고, 이것은 특수한 하드웨어가 출현하기 전까지 번거로운 작업이었다. 1990년대 초에는 실수의 처리 능력이 정수 처리에 비해서 뒤처져 있었으나, 고속 역 제곱근 알고리즘은 float 실수의 나눗셈을 우회하여 이러한 문제 해소에 도움이 되었다. ''퀘이크 3 아레나''에서 그래픽 연산 속도를 개선하기 위해서 사용되었고, 이후 [[FPGA]]를 사용하는 일부 특수한 하드웨어 [[버텍스 셰이더]]에 구현되기도 했다.

== 코드 ==
다음 코드는 ''퀘이크 3 아레나''에 사용된 것으로, [[C (프로그래밍 언어)|C]] [[전처리]] 지시문을 생략했지만 본래의 정확한 주석까지 포함하였다.<ref name="quakesrc">{{웹 인용|url=https://github.com/id-Software/Quake-III-Arena/blob/master/code/game/q_math.c#L552|title=quake3-1.32b/code/game/q_math.c|work=[[Quake III Arena]]|publisher=[[id Software]]|accessdate=2017-01-21}}</ref>
<syntaxhighlight lang="c">
float Q_rsqrt( float number )
{
	long i;
	float x2, y;
	const float threehalfs = 1.5F;

	x2 = number * 0.5F;
	y = number;
	i = * ( long * ) &y;                     // evil floating point bit level hacking
	i = 0x5f3759df - ( i >> 1 );             // what the fuck?
	y = * ( float * ) &i;
	y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
//	y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

	return y;
}
</syntaxhighlight>
1990년대 초에 역 제곱근을 계산하는 통상적인 방법은 첫 번째 근사치를 [[순람표]]로 정하는 것이었다. 이 코드는 순람표를 사용하는 것보다 빠르다는 것을 증명하였고, 다른 일반적인 알고리즘보다 네 배 정도 더 빨랐다. 이 코드에 사용되는 상수 0x5f3759df는 그 의미를 즉시 파악하기 어렵기 때문에 이 알고리즘의 [[매직 넘버]]라고 불린다. 이 알고리즘은 [[뉴턴의 방법]]을 사용하여 비교적 정확한 결과를 만들어 내지만, 소수점의 손실 때문에 부정확하고 1999년부터 도입된 [[x86]] [[스트리밍 SIMD 확장|SSE]]의 <code>rsqrtss</code>에 비해 훨씬 느리다.<ref name="ruskin">{{웹 인용 |url=http://assemblyrequired.crashworks.org/timing-square-root/ |title=Timing square root |work=Some Assembly Required |first=Elan |last=Ruskin |date=2009-10-16 |accessdate=2015-05-07 |archive-date=2015-05-18 |archive-url=https://web.archive.org/web/20150518093102/http://assemblyrequired.crashworks.org/timing-square-root/ }}</ref> 또한 <code>long</code>을 사용하는 것은 64비트 시스템에 대한 코드의 이식성을 떨어뜨릴 수 있다.

=== 실행 예시 ===
다음은 <math>x=0.15625</math>에 대해 <math>x^{-1/2}\approx 2.52982</math>를 근사하는 과정이다:
 0_01111100_01000000000000000000000  = 1.25 * 2^-3, Bit pattern of both x and i
 0_00111110_00100000000000000000000  = 1.125 * 2^-65, Shift right one position: (i >> 1)
 0_10111110_01101110101100111011111  = 1.432430... * 2^+63, The magic number 0x5f3759df
 0_10000000_01001110101100111011111  = 1.307430... * 2^+1, The result of 0x5f3759df - (i >> 1)
이 결과는 2.61486으로 약 3.4%의 오차를 갖는다. 뉴턴의 방법을 한 번 사용하면 2.52549로 약 0.17%의 오차를 갖게 된다.

== 알고리즘 설명 ==
이 알고리즘은 다음 단계들을 수행해서 <math>x^{-1/2}</math>를 계산한다:
# float 양수 <math>x</math>를 long [[비트 (단위)|비트]]로 취급하면 <math>\log_2(x)</math>로 다룰 수 있다.
# 매직 넘버를 사용하여 <math>-\log_2(x)/2</math>의 근사치를 계산한다.
# long 비트 <math>\log_2(x^{-1/2})</math>를 다시 float 양수로 취급하면 <math>x^{-1/2}</math>로 다룰 수 있다.
# 뉴턴의 방법을 한 번 사용하여 근사치를 수정한다.

=== 부동소수점 ===
{{본문|IEEE 754}}
이 알고리즘은 부동소수점의 [[IEEE 754|단정밀도]] 표현에 의존하기 때문에, 이에 관한 설명이 필요하다. 한 예로, 십진수 실수 -118.625에 대한 부동소수점의 단정밀도 표현은 다음과 같다:
# 음수이므로 sign은 1이 된다.
# <math>118.625_{10}=1110110.101_{2}</math>이므로 <math>118.625_{10}=(2_{10})^{110_{2}}\times 1.110110101_{2}</math>와 같이 소수점 앞을 1로 만든다.
# exponent는 음수 지수를 고려해서 지수에 127을 더한 <math>110_{2}+1111111_{2}=10000101_{2}</math>가 되고, fraction은 소수점 뒤의 23자리 11011010100000000000000가 된다.
# 정리하면 다음과 같다:
[[파일:Float point example frac.svg|가운데]]
그러므로 float 양수 <math>x</math>를 long 비트로 표현하면 <math>x=2^\mathrm{E}(1+\mathrm{M})</math>에서 <math>\mathrm{E}</math>와 <math>\mathrm{M}</math>을 알 수 있고, 양변에 밑이 2인 로그를 취하면
: <math>\log_2(x)=\mathrm{E}+\log_2(1+\mathrm{M})</math>
이다.

=== 로그 근사 ===
여기서 범위 <math>0\leq\mathrm{M}<1</math>를 고려하여
: <math>\log_2(1+\mathrm{M})\approx\sigma+\mathrm{M}</math>
로 근사할 수 있는 상수 <math>\sigma</math>를 사용한다.
: <math>(\log_2(1+\mathrm{M})-\mathrm{M})'=\frac{1}{\mathrm{M}\ln 2+\ln 2}-1</math>
이고 극댓값은 <math>\mathrm{M}=\frac{1}{\ln 2}-1</math>에서 0.0860713이기 때문에, 오차의 [[고른 노름|최댓값]]이 가장 작은 상수는 그 절반인 <math>\sigma=0.0430357</math>이다.
[[파일:Log_by_aliasing_to_int.svg|오른쪽|섬네일|float 양수에 대한 로그 근사(파란색)와 로그 함수(회색).]]

한편 양수 <math>x</math>를 비트 <math>n_x</math>로 취급하면
: <math>\begin{align}
n_x &=2^{23}(\mathrm{E}+127)+2^{23}\times\mathrm{M}\\
&=2^{23}(\mathrm{E}+\sigma+\mathrm{M})+2^{23}(127-\sigma)\\
&\approx 2^{23}\log_2(x)+2^{23}(127-\sigma)
\end{align}</math>
이다. <math>\sigma=0.0430357</math>를 대입하면
: <math>\begin{align}
n_{x^{-1/2}} &\approx 2^{23}\log_2(x^{-1/2})+2^{23}(127-\sigma) \\
&=3/2\times 2^{23}(127-\sigma)-n_x/2\\
&\approx\mathrm{0x5f37bcb6}-n_x/2
\end{align}</math>
이다. 이것은 다음 코드로 쓰였다:
<syntaxhighlight lang="c">
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
</syntaxhighlight>
매직 넘버 0x5f3759df가 어떻게 유도되었는지는 알려져 있지 않으며, 추론되는 상수는 <math>\sigma\approx 0.0450466</math>이다.

=== 뉴턴의 방법 ===
{{본문|뉴턴 방법}}
함수 <math>f(x)</math>에 대한 <math>(x_0,\ f(x_0))</math>에서의 [[접선]]
: <math>y=f'(x_0)(x-x_0)+f(x_0)</math>
의 [[직선|<math>x</math>절편]]은 <math>x_0</math>보다 방정식 <math>f(x)=0</math>의 해에 근접하며, 이 값을 다시 <math>x_0</math>에 대입하여 방정식의 해를 근사할 수 있다. 위의 단계는 이 방법에 대한 신뢰할 만한 초깃값을 제공하여 함수 <math>f(y)=\frac{1}{y^2}-x</math>에 대한 <math>(y_0,\ f(y_0))</math>에서의 접선
: <math>z=-\frac{2}{{y_0}^3}(y-y_0)+\frac{1}{{y_0}^2}-x</math>
의 <math>y</math>절편
: <math>y_1=y_0\left(\frac{3}{2}-\frac{x}{2}(y_0)^2\right)</math>
으로 방정식 <math>f(x^{-1/2})=0</math>의 해인 역 제곱근을 근사할 수 있다. 이것은 다음 코드로 쓰였다:
<syntaxhighlight lang="c">
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
</syntaxhighlight>

=== 정확도 ===
[[파일:Invsqrt0-10000.svg|오른쪽|섬네일|libstdc의 역 제곱근과 [[휴리스틱 이론|휴리스틱]]적인 고속 역 제곱근의 차이.]]

오른쪽 그래프는 뉴턴의 방법을 한 번 사용하여 개선된 오차를 보여 준다. 0.01에 대해 [[표준 라이브러리]]는 10.0을, 위 알고리즘을 적용한 함수는 9.982522를 반환하며, [[로그 스케일]]에서 서로의 차이는 일정한 영역을 벗어나지 않는다. Chris Lomont는 더 정확하게 근사하는 매직 넘버 0x5f375a86을, Matthew Robertson은 배정밀도 표현에서 유효한 매직 넘버 0x5fe6eb50c7b537a9를 찾아냈다.<ref name="robertson">{{웹 인용|url=https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf|title=A Brief History of InvSqrt|author=Matthew Robertson|date=2012-04-24|publisher=UNBSJ|확인날짜=2017-11-28|archive-date=2016-03-29|archive-url=https://web.archive.org/web/20160329040932/https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf|url-status=}}</ref>

== 같이 보기 ==
* [[매직 넘버 (프로그래밍)]]
== 각주 ==
{{각주}}

== 문서 ==
* {{저널 인용|ref = harv|last = Blinn|first = Jim|title = Floating Point Tricks|journal = Computer Graphics & Applications, IEEE|date = July 1997|volume = 17|issue = 4|doi = 10.1109/38.595279|pages = 80}}
* {{서적 인용|last = Blinn|ref = harv|first = Jim|title = Jim Blinn's Corner: Notation, notation notation|url = https://archive.org/details/jimblinnscornern0000blin|publisher = Morgan Kaufmann|year = 2003|isbn = 1-55860-860-5}}
* {{서적 인용|ref = harv|last = Eberly|first = David|title = 3D Game Engine Design|url = https://archive.org/details/isbn_9781558605930|publisher = Morgan Kaufmann|year = 2001|isbn = 978-1-55860-593-0}}
* {{저널 인용|ref = harv|last = Eberly|first = David|title = Fast and Accurate Inverse Square Root|publisher = Geometric Tools|year = 2015|url = http://www.geometrictools.com/Documentation/FastInverseSqrt.pdf|accessdate = 2015-05-09|보존url = https://web.archive.org/web/20090224214314/http://www.geometrictools.com/Documentation/FastInverseSqrt.pdf|보존날짜 = 2009-02-24|url-status = dead}}
* {{서적 인용|ref = {{harvid|Hennessey|Patterson|1998}}|last = Hennessey|first = John|author2 = Patterson, David A.|title = Computer Organization and Design|publisher = Morgan Kaufmann Publishers|location = San Francisco, CA|year = 1998|edition = 2nd|isbn = 978-1-55860-491-9|url = http://books.google.com/?id=7-ZQAAAAMAAJ}}
* {{저널 인용|ref = harv|last = Kushner|first = David|date = August 2002|title = The wizardry of Id|journal = IEEE Spectrum|volume = 39|issue = 8|pages = 42–47|doi = 10.1109/MSPEC.2002.1021943}}
* {{웹 인용|ref = harv|url = http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf|title = Fast Inverse Square Root|last = Lomont|first = Chris|date = February 2003|accessdate = 2009-02-13}}
* {{웹 인용|ref = harv|url = http://www.daxia.com/bibis/upload/406Fast_Inverse_Square_Root.pdf|title = The Mathematics Behind the Fast Inverse Square Root Function Code|last = McEniry|first = Charles|date = August 2007|accessdate = 2009-02-13|보존url = https://web.archive.org/web/20150511044204/http://www.daxia.com/bibis/upload/406Fast_Inverse_Square_Root.pdf|보존날짜 = 2015-05-11|url-status = dead}}
* {{콘퍼런스 인용|ref = {{harvid|Middendorf|2007}}|last1 = Middendorf|first1 = Lars|last2 = Mühlbauer|first2 = Felix|last3 = Umlauf|first3 = George|last4 = Bodba|first4 = Christophe|date = June 1, 2007|title = Embedded Vertex Shader in FPGA|conference = [[International Federation for Information Processing|IFIP]] TC10 Working Conference:International Embedded Systems Symposium (IESS)|booktitle = Embedded System Design: Topics, Techniques and Trends|editor = Rettberg, Achin|others = et al.|publisher = Springer|location = [[Irvine, California]]|isbn = 978-0-387-72257-3}}
* {{웹 인용|archiveurl = https://web.archive.org/web/20090215020337/http://www.hackszine.com/blog/archive/2008/12/quakes_fast_inverse_square_roo.html|url = http://www.hackszine.com/blog/archive/2008/12/quakes_fast_inverse_square_roo.html|title = Quake's fast inverse square root|last = Striegel|first = Jason|date = December 4, 2008|work = Hackszine|publisher = [[오라일리 미디어]]|archivedate = 2009-02-15|accessdate = 2013-01-07|url-status = dead}}

== 외부 링크 ==
* [https://web.archive.org/web/20140202234227/http://shelfflag.com/rsqrt.pdf A Brief History of InvSqrt] by Matthew Robertson
* [https://web.archive.org/web/20120920120948/http://blog.quenta.org/2012/09/0x5f3759df.html 0x5f3759df], further investigations into accuracy and generalizability of the algorithm by Christian Plesner Hansen
* [http://www.beyond3d.com/content/articles/8/ Origin of Quake3's Fast InvSqrt()]
* [https://github.com/id-Software/Quake-III-Arena Quake III Arena source code]
* {{웹 인용|url = http://www.codemaestro.com/reviews/9|title = Magical Square Root Implementation In Quake III|last = Margolin|first = Tomer|date = 27 August 2005|work = CodeMaestro|publisher = The Coding Experience|확인날짜 = 2012년 5월 15일|보존url = https://web.archive.org/web/20120414130404/http://www.codemaestro.com/reviews/9|보존날짜 = 2012년 4월 14일|url-status = dead}}

[[분류:소스 코드]]
[[분류:수치해석학]]
[[분류:퀘이크 시리즈]]
[[분류:근 찾기 알고리즘]]