Contents
I. Overview
II. stream

I. Overview
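In embedded development, designing a product or scoping a project often requires evaluating the performance of its key chips. This article summarizes the tools used for performance evaluation of Linux-based products. All tests were run on aarch64 (arm64), and the board's root filesystem is Debian. The tools are listed below: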
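| Name | Purpose | Git source |
|------|---------|------------|
| lmbench | Bandwidth and latency (response time) benchmarks | https://github.com/redrose2100/lmbench.git |
| stream | Memory bandwidth test (bytes transferred per second) | https://github.com/jeffhammond/STREAM.git |
| unixbench | Basic performance test for UNIX systems. The result reflects not only CPU, memory, or disk, but also the hardware, OS version, and compiler. | https://github.com/kdlucas/byte-unixbench.git |
| cyclictest and stress-ng | Real-time (latency) testing | Stress tool (stress-ng): https://github.com/ColinIanKing/stress-ng.git; test tool (cyclictest, part of rt-tests): git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git |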
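The sources in the table can be fetched in one pass. The following is only a sketch: the function name `fetch_tools` and the `GIT` override variable are assumptions made for this example, and the URLs are the ones listed above.

```shell
#!/bin/sh
# Sketch only: clone the benchmark sources listed in the table.
# GIT and fetch_tools are hypothetical names introduced for this example.
# Setting GIT=echo previews the commands without cloning anything.
GIT=${GIT:-git}

fetch_tools() {
    dest=${1:-.}    # target directory, defaults to the current directory
    for url in \
        https://github.com/redrose2100/lmbench.git \
        https://github.com/jeffhammond/STREAM.git \
        https://github.com/kdlucas/byte-unixbench.git \
        https://github.com/ColinIanKing/stress-ng.git \
        git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
    do
        # Derive the checkout directory from the URL, e.g. STREAM.git -> STREAM
        name=$(basename "$url" .git)
        "$GIT" clone "$url" "$dest/$name" || return 1
    done
}

# Example: fetch_tools ~/bench-src
```

Running it with `GIT=echo` first is a cheap way to check the target paths before any network access happens.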
II. stream
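1. Build

Change the Makefile to the following (recipe lines must begin with a tab):

CC ?= gcc
CFLAGS = -O3 -fno-PIC -mcmodel=large -fopenmp -DSTREAM_ARRAY_SIZE=200000000 -DNTIMES=30

all: stream

clean:
	rm -f stream *.o

stream: stream.c
	$(CC) $(CFLAGS) stream.c -o stream

With STREAM_ARRAY_SIZE=200000000 the three statically allocated arrays total about 4.5 GiB, which is why -mcmodel=large is used; GCC's large code model on aarch64 does not support PIC, hence -fno-PIC.

Then cross-compile:

export CC=aarch64-linux-gnu-gcc
make

2. Copy the compiled stream binary to the embedded board.

3. Run the tests

Single-threaded:

export OMP_NUM_THREADS=1
./stream > stream-result-1thread.txt

Multi-threaded, using 8 threads as an example. The CPU here has 8 cores; at one thread per core, 8 threads is the maximum:

export OMP_NUM_THREADS=8
export GOMP_CPU_AFFINITY=0-7
./stream > stream-result-8thread.txt

4. Problems encountered at runtime

The following error is reported:

./stream: error while loading shared libraries: libgomp.so.1: cannot open shared object file: No such file or directory

On Debian, install the libgomp1 package:

dpkg -i libgomp1_8.3.0-6_arm64.deb

On Buildroot, enable the option that builds this package, then rebuild.

5. Results and explanation

Single-threaded: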
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 200000000 (elements), Offset = 0 (elements)
Memory per array = 1525.9 MiB (= 1.5 GiB).
Total memory required = 4577.6 MiB (= 4.5 GiB).
Each kernel will be executed 30 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 1
Number of Threads counted = 1
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 347944 microseconds.
   (= 347944 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           11397.9     0.280894     0.280754     0.281033
Scale:          10245.3     0.312539     0.312339     0.313669
Add:             8855.1     0.542250     0.542060     0.542685
Triad:           8857.6     0.542100     0.541906     0.542925
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------

Multi-threaded:
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 200000000 (elements), Offset = 0 (elements)
Memory per array = 1525.9 MiB (= 1.5 GiB).
Total memory required = 4577.6 MiB (= 4.5 GiB).
Each kernel will be executed 30 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 8
Number of Threads counted = 8
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 339367 microseconds.
   (= 339367 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           14378.2     0.223113     0.222559     0.223810
Scale:          12578.0     0.257082     0.254413     0.260384
Add:            10312.8     0.468002     0.465440     0.470596
Triad:           8938.6     0.542479     0.536994     0.548937
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
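To compare the two runs without reading the tables by eye, the Best Rate column can be pulled out of both saved result files. This is a minimal sketch, assuming the files were written by redirecting the output of ./stream as in step 3; `compare_stream` is a name made up for this example.

```shell
#!/bin/sh
# Sketch: print the Best Rate (MB/s) of two STREAM result files side by side,
# followed by the ratio second/first. compare_stream is a hypothetical helper.
#   $1 = baseline result file (e.g. the 1-thread run)
#   $2 = result file to compare against the baseline
compare_stream() {
    awk '
        # Result rows look like "Copy: 11397.9 0.280894 0.280754 0.281033"
        /^(Copy|Scale|Add|Triad):/ {
            if (FNR == NR) { base[$1] = $2; next }   # rows of the first file
            if (!($1 in base)) next                  # no baseline for this row
            printf "%-6s %9.1f %9.1f  x%.2f\n", $1, base[$1], $2, $2 / base[$1]
        }
    ' "$1" "$2"
}

# Example: compare_stream stream-result-1thread.txt stream-result-8thread.txt
```

With the two outputs shown above, the ratios come out at roughly 1.26 for Copy, 1.23 for Scale, 1.16 for Add, and 1.01 for Triad, so going from 1 to 8 threads raises the measured bandwidth only modestly on this board.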
Note: focus on the bandwidth (Best Rate, MB/s) and the timing columns in the following four rows:
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           14378.2     0.223113     0.222559     0.223810
Scale:          12578.0     0.257082     0.254413     0.260384
Add:            10312.8     0.468002     0.465440     0.470596
Triad:           8938.6     0.542479     0.536994     0.548937
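The Best Rate and Min time columns are two views of the same measurement. STREAM counts the bytes moved per iteration (Copy and Scale touch two arrays, Add and Triad touch three) and divides by the best time, reporting the result in units of 10^6 bytes per second. The sketch below redoes that arithmetic; `stream_rate` is a hypothetical helper name, and the 8 bytes per element and array length of 200000000 are taken from the output above.

```shell
#!/bin/sh
# Sketch: recompute a STREAM "Best Rate MB/s" value from the "Min time" column.
# stream_rate is a hypothetical helper, not part of STREAM itself.
#   $1 = arrays touched per iteration (2 for Copy/Scale, 3 for Add/Triad)
#   $2 = array length in elements (STREAM_ARRAY_SIZE)
#   $3 = Min time in seconds
#   $4 = bytes per element (defaults to 8, as reported in the output)
stream_rate() {
    awk -v arrays="$1" -v n="$2" -v t="$3" -v elem="${4:-8}" \
        'BEGIN { printf "%.1f\n", 1e-6 * arrays * elem * n / t }'
}

stream_rate 2 200000000 0.222559   # Copy, 8 threads: prints 14378.2
stream_rate 3 200000000 0.465440   # Add, 8 threads:  prints 10312.8
```

Because the rate is derived from the minimum time, a large gap between Min time and Max time in a row suggests run-to-run noise, and the reported bandwidth is then closer to a best case than a typical value.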