中国IT动力,最新最全的IT技术教程
最新100篇 | 推荐100篇 | 专题100篇 | 排行榜 | 搜索 | 在线API文档
首 页 | 程序开发 | 操作系统 | 软件应用 | 图形图象 | 网络应用 | 精文荟萃 | 教育认证 | 硬件维护 | 未整理篇 | 站长教程
ASP JS PHP工程 ASP.NET 网站建设 UML J2EESUN .NET VC VB VFP 网络维护 数据库 DB2 SQL2000 Oracle Mysql
服务器 Win2000 Office C DreamWeaver FireWorks Flash PhotoShop 上网宝典 CorelDraw 协议大全 网络安全 微软认证
硬件维护  CPU  主板  硬盘  内存  显卡  显示器  键盘鼠标  声卡音箱  打印机  机箱电源  BIOS  网卡  C#  Java  Delphi  vs.net2005
  当前位置:> 程序开发 > 编程语言 > C/C++
关于《一个奇怪的函数参数定义及解答》的进一步讨论
作者:未知 时间:2005-09-13 23:29 出处:Blog.ChinaUnix.net 责编:chinaitpower
              摘要:关于《一个奇怪的函数参数定义及解答》的进一步讨论

《一个奇怪的函数参数定义及解答》一文

http://blog.chinaunix.net/article.php?articleId=37822&blogId=5727

中提到

unsigned int str_len(s)
register char *s;
{
  register char *t;

  t = s;
  for (;;) {
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
  }
}

在循环中为何使用四次if (!*t) return t - s; ++t;的问题,我想我关于“可能是流水线”的解释颇有问题,看看comp.lang.c关于此的讨论:

http://groups.google.com/group/comp.lang.c/browse_thread/thread/55936de0a54b279b/b2326ffb85ce03f4?hl=en#b2326ffb85ce03f4

摘录一些如下:

duff's device / loop unriolling
All 19 messages in topic - view as tree
Jan Richter  Aug 19, 5:53 pm     show options
Newsgroups: comp.lang.c
From: "Jan Richter" <vincent00...@yahoo.de> - Find messages by this author
Date: Fri, 19 Aug 2005 11:53:21 +0200
Local: Fri, Aug 19 2005 5:53 pm
Subject: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

Hi there,

the Code below shows DJBs own implementation of strlen (str_len):

unsigned int str_len(char *s)
{
  register char *t;
  t = s;
  for (;;) {
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
    if (!*t) return t - s; ++t;
  }

}

I understand the code so far, but I have a question about the loops.
To me, it seems he used loop unrolling (aka duff's device) to optimize
the code while it is executed. But why did he use four loops? When
the function is invoked, he didn't know how big "s" is. Or am I wrong here?
I always thought, to unroll a loop I need to know how often the loop is
used.

Any help would be appreciated,
JR

Reply


Giannis Papadopoulos  Aug 19, 6:13 pm     show options
Newsgroups: comp.lang.c
From: Giannis Papadopoulos <ipapa...@inf.uth.gr> - Find messages by this author
Date: Fri, 19 Aug 2005 13:13:19 +0300
Local: Fri, Aug 19 2005 6:13 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

In my linbox, strlen() gives by far better results...

--
one's freedom stops where others' begin

Giannis Papadopoulos
http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.

Reply

- Show quoted text -

Jan Richter  Aug 19, 6:31 pm     show options
Newsgroups: comp.lang.c
From: "Jan Richter" <vincent00...@yahoo.de> - Find messages by this author
Date: Fri, 19 Aug 2005 12:31:01 +0200
Local: Fri, Aug 19 2005 6:31 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

"Giannis Papadopoulos" wrote:

[...]

> In my linbox, strlen() gives by far better results...

Yes, same for me, so where is the benefit?

Cheers,
JR

Reply


Giannis Papadopoulos  Aug 19, 6:45 pm     show options
Newsgroups: comp.lang.c
From: Giannis Papadopoulos <ipapa...@inf.uth.gr> - Find messages by this author
Date: Fri, 19 Aug 2005 13:45:55 +0300
Local: Fri, Aug 19 2005 6:45 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

No benefit... Maybe it is written for a compiler that does not know how
to unroll loops...

And yes, loop unrolling works perfectly when you have a hint how many
times this loop will be executed.

Doesn't the author of this peculiar str_len() say anything?

--
one's freedom stops where others' begin

Giannis Papadopoulos
http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.

Reply

- Show quoted text -

kerne...@hotmail.com  Aug 19, 6:59 pm     show options
Newsgroups: comp.lang.c
From: kerne...@hotmail.com - Find messages by this author
Date: 19 Aug 2005 03:59:32 -0700
Local: Fri, Aug 19 2005 6:59 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Remove | Report Abuse

 Jan Richter  wrote:

Reply

>But why did he use four loops? When
>the function is invoked, he didn't know how big "s" is.

may be based on the consideration of pipeline.
I heard Intel PII CPU has four levels  pipelines of integer operation.
is that right?

Richard Bos  Aug 19, 7:02 pm     show options
Newsgroups: comp.lang.c
From: r...@hoekstra-uitgeverij.nl (Richard Bos) - Find messages by this author
Date: Fri, 19 Aug 2005 11:02:59 GMT
Local: Fri, Aug 19 2005 7:02 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

"Jan Richter" <vincent00...@yahoo.de> wrote:
> "Giannis Papadopoulos" wrote:

> > In my linbox, strlen() gives by far better results...

> Yes, same for me, so where is the benefit?

The original Duff's Device did a bit more than merely count. It's in the
FAQ, btw, which you should've read.
<
http://www.eskimo.com/~scs/C-faq/top.html>.

Richard

Reply


Netocrat  Aug 19, 7:17 pm     show options
Newsgroups: comp.lang.c
From: Netocrat <netoc...@dodo.com.au> - Find messages by this author
Date: Fri, 19 Aug 2005 21:17:55 +1000
Local: Fri, Aug 19 2005 7:17 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

As someone else has said most optimising compilers will do this for you.
It's a far more appropriate technique for assembly.  But perhaps the
author had a specific implementation in mind for which this happened to
work best.  Also as pointed out elsewhere Duff's Device is a more specific
term than loop unrolling.

> But why did he use four loops? When the function is invoked, he didn't
> know how big "s" is. Or am I wrong here? I always thought, to unroll a
> loop I need to know how often the loop is used.

There are two main considerations: speed vs code size.  

More unrolling == less jump opcodes == more loop body code.

Which is equivalent to a speed increase at the cost of a code size
increase.

If you know how often the loop body will be iterated on average then you
can avoid unrolling the loop by more than that number.  Hence you avoid
the extra code without affecting the speed increase.

If you don't know, then you make a decision as to how much extra code
you're willing to tolerate.  There are other issues like keeping all the
loop body code in the instruction cache and possibly others that I'm not
too familiar with not being an assembly coder.

For the strlen function as you point out you can't know what the average
number of iterations will be so the only consideration is code size.
Providing you can keep it all in the instruction cache the loop could be
unrolled indefinitely and continue to provide speed increases for calls
with a sufficiently long string.

Why did the author choose four?  Who knows - perhaps he wanted the
function to be reasonably small whilst still providing a modest speed
increase.  Which as you and others point out, it doesn't.  Going to show
once again that usually optimisations are best left to the compiler or
assembly.

--
http://members.dodo.com.au/~netocrat

Reply

- Show quoted text -

pete  Aug 19, 8:07 pm     show options
Newsgroups: comp.lang.c
From: pete <pfil...@mindspring.com> - Find messages by this author
Date: Fri, 19 Aug 2005 12:07:48 GMT
Local: Fri, Aug 19 2005 8:07 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

Why only four if statements in the loop?

I guess that's not supposed to be an example of portable code.
The (t-s) expressions are of type ptrdiff_t,
which is unsuitable for use that way, in portable code.

That's not Duff's Device.
http://www.lysator.liu.se/c/duffs-device.html

--
pete

Reply

- Show quoted text -

Christopher Benson-Manica  Aug 19, 10:19 pm     show options
Newsgroups: comp.lang.c
From: Christopher Benson-Manica <a...@nospam.cyberspace.org> - Find messages by this author
Date: Fri, 19 Aug 2005 14:19:58 +0000 (UTC)
Local: Fri, Aug 19 2005 10:19 pm
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

Giannis Papadopoulos <ipapa...@inf.uth.gr> wrote:
> No benefit... Maybe it is written for a compiler that does not know how
> to unroll loops...

Probably; an implementation simple-minded enough to trust the
programmer when he uses the "register" storage class specifier
probably could use some help unrolling a loop.  The VAX compiler for
which Duff originally wrote his device was (presumably) such an
implementation.

--
Christopher Benson-Manica  | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org    | don't, I need to know.  Flames welcome.

Reply


Keith Thompson  Aug 20, 5:27 am     show options
Newsgroups: comp.lang.c
From: Keith Thompson <k...@mib.org> - Find messages by this author
Date: Fri, 19 Aug 2005 21:27:32 GMT
Local: Sat, Aug 20 2005 5:27 am
Subject: Re: duff's device / loop unriolling
Reply | Reply to Author | Forward | Print | Individual Message | Show original | Report Abuse

Duff's device is a specific technique, not a synonym for loop
unrolling.  See question 20.35 of the C FAQ.

The use of unsigned int is questionable.  The return type should be
size_t.  It's possible that this implementation predates the existence
of size_t (though it's been modernized with a prototype).  I would
guess that it might be more efficient than strlen() only on a very old
implementation.

A matter of terminology: it doesn't have four loops.  It has a single
loop containing four if statements.

The statement that's executed N times is

    if (!*t) return t - s; ++t;

N needn't be a multiple of 4 because it can break out of the loop
partway through the body of the for loop.

In the following, I'll use the term "iteration" for an execution of
the entire body of the loop, and "sub-iteration" for an execution of a
single line within the loop.

The idea of loop unrolling is that it avoids the overhead of a test on
each iteration, falling through from one statement to the next and
performing the test and branch only once every 4 elements.  The number
of times a loop is to be unrolled is a tradeoff between speed and code
size -- and if it's unrolled too much (say, 1024 times), the code size
itself can make it run slower due to cache issues.  This is all
*extremely* system-specific, which is why the whole thing is best left
to the compiler.

Furthermore, the "beauty" of Duff's Device is that it jumps into the
middle of the loop, so the loop never has to exit from the middle, and
it really does avoid the overhead on each sub-iteration.  The code
above doesn't do this; instead, it performs a test and branch on each
sub-iteration.

I'd be surprised to see the above code performing better than strlen()
on any modern implementation, particularly since an implementation is
free to implement strlen() using whatever non-portable tricks it likes
to squeeze out the last clock cycle.

--
Keith Thompson (The_Other_Keith) k
...@mib.org  <http://www.ghoti.net/~kst>
San Diego Supercomputer Center             <*>  <
http://users.sdsc.edu/~kst>
We must do something.  This is something.  Therefore, we must do this.

关闭本页

 
首页 | 投资与合作 | 服务条款 | 隐私政策 | 收藏本站 | 设为首页 | 新用户注册 | 免责声明 | 使用帮助
Copyright ©2005-2008 chinaitpower.com All rights reserved. www.chinaitpower.com 版权所有