The Sprite Network Operating System
Date: October 22, 2007
(1)
1.What is the problem the authors are trying to
solve?
In more and more research and engineering organizations, computing occurs on
personal workstations connected by local-area networks, with larger time-shared
machines used only for those applications that cannot achieve acceptable
performance on workstations.
The environments tend to suffer from poor performance and difficulties of
sharing and administration, due to the distributed nature of the systems.
Today’s engineering workstations typically contain 4 to 32 Mbytes of physical
memory, and the authors expect memories of 100-500 Mbytes to be commonplace
within a few years.
The authors hope that Sprite will facilitate the development of multiprocessor
applications, and that the operating system itself will be able to take
advantage of multiple processors in providing system services.
2.What other approaches or solutions existed at the time that this work was
done?
For the most part, Sprite’s kernel calls are similar to those provided by the
4.3 BSD version of the UNIX operating system. However, the authors added three
additional facilities to Sprite in order to encourage resource sharing.
3.What was wrong with the other approaches or solutions?
Although one of Sprite’s primary goals is to support applications running on
SPUR workstations, the authors hope that the system will work well for a variety
of high-performance engineering workstations.
4.What is the authors' approach or solution?
The RPC mechanism is used extensively in Sprite to implement other features,
such as the network file system and process migration.
Sprite uses a simple mechanism called prefix tables to manage the name space;
these dynamic structures facilitate system administration and reconfiguration.
To achieve high performance in the file system, and also to capitalize on large
physical memories,
Sprite caches file data both on server machines and client machines.
Sprite retains the code segments for programs in main memory even after the
programs complete, in
order to allow quick start-up when programs are reused. Lastly, the virtual
memory system negotiates
with the file system over physical memory.
Sprite guarantees that processes behave the same whether migrated or not. This
is achieved by
designating a home machine for each process and forwarding location-dependent
kernel calls to
the process’s home machine.
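The home-machine rule above can be illustrated with a minimal sketch. It is not Sprite's kernel code: the `Machine` and `Process` classes and the set of location-dependent call names are hypothetical, chosen only to show how forwarding keeps a migrated process's view unchanged.

```python
from dataclasses import dataclass

# Hypothetical set of calls whose result depends on where they run.
LOCATION_DEPENDENT = {"gethostname", "gettimeofday"}


@dataclass
class Machine:
    name: str

    def execute(self, call: str) -> str:
        # Stand-in for actually running a kernel call on this machine.
        return f"{call}@{self.name}"


@dataclass
class Process:
    home: Machine      # machine where the process was created
    current: Machine   # machine where it is executing now

    def kernel_call(self, call: str) -> str:
        # Location-dependent calls are forwarded to the home machine;
        # all other calls run where the process currently executes.
        target = self.home if call in LOCATION_DEPENDENT else self.current
        return target.execute(call)


home, idle = Machine("home"), Machine("idle")
proc = Process(home=home, current=home)
before = proc.kernel_call("gethostname")
proc.current = idle                      # the process migrates
after = proc.kernel_call("gethostname")  # still answered by the home machine
```

Calls that do not depend on location run on the machine where the process currently executes, so only the forwarded calls pay for a round trip to the home machine.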
5.Why is it better than the other approaches or solutions?
Although one of Sprite’s primary goals is to support applications running on
SPUR workstations, the authors hope that the system will work well for a variety
of high-performance engineering workstations.
6.How does it perform?
The implementation of RPC consists of stubs and RPC transport. Together they
hide the fact that the calling procedure and the called procedure are on
different machines. For each remote call there are two stubs, one on the client
workstation and one on the process’s home machine.
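As a rough illustration of how stubs and a transport hide the machine boundary, here is a small sketch. It assumes JSON as the message format and replaces the network with a direct function call; the names `client_stub`, `server_stub`, `transport` and `PROCEDURES` are invented for this example and are not Sprite's interfaces.

```python
import json


def add(a, b):
    return a + b


# Hypothetical table of procedures that may be called remotely.
PROCEDURES = {"add": add}


def server_stub(request_bytes: bytes) -> bytes:
    # Unpack the arguments, call the real procedure, pack the result.
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()


def transport(request_bytes: bytes) -> bytes:
    # Stand-in for the RPC transport; a real one would exchange packets
    # with another machine instead of calling the server stub directly.
    return server_stub(request_bytes)


def client_stub(proc: str, *args):
    # Pack the call into a message, ship it, and unpack the reply, so the
    # caller sees what looks like an ordinary procedure call.
    reply = transport(json.dumps({"proc": proc, "args": list(args)}).encode())
    return json.loads(reply)["result"]


def remote_add(a, b):
    return client_stub("add", a, b)
```

A caller writes `remote_add(2, 3)` exactly as it would call a local function; only the stubs know that a message was built and sent.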
In Sprite, we use a more dynamic approach to managing the domain structure,
which we call prefix tables.
The kernel of each client machine maintains a private prefix table. Each entry
in a prefix table corresponds
to a domain; it gives the full name of the top-level directory in the domain,
the name of the server on
which that domain is stored, and an additional token to pass to the server to
identify the domain.
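A prefix table entry as described above can be sketched as a small record, with lookup choosing the entry whose prefix matches the most of the file name. The longest-match rule, the `PrefixEntry` type and the `lookup` helper are assumptions made for this illustration; the server names and tokens are made up.

```python
from typing import NamedTuple, Optional, Tuple


class PrefixEntry(NamedTuple):
    prefix: str   # full name of the domain's top-level directory
    server: str   # server on which the domain is stored
    token: int    # token passed to the server to identify the domain


def lookup(table, path: str) -> Optional[Tuple[PrefixEntry, str]]:
    """Return (entry, remainder of path) for the longest matching prefix."""
    best = None
    for entry in table:
        p = entry.prefix
        # A prefix matches only on whole path components.
        matches = p == "/" or path == p or path.startswith(p + "/")
        if matches and (best is None or len(p) > len(best.prefix)):
            best = entry
    if best is None:
        return None
    return best, path[len(best.prefix):].lstrip("/")


table = [
    PrefixEntry("/", "server-x", 17),
    PrefixEntry("/a", "server-y", 63),
    PrefixEntry("/a/b", "server-z", 44),
]
entry, rest = lookup(table, "/a/b/c")
```

The client would then send `rest` and `entry.token` to `entry.server`, which resolves the remainder of the name inside its own domain.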
Sprite’s caches use a consistency protocol that allows applications on different
workstations to share files just as if they were running on a single
time-sharing system.
Sprite’s implementation of virtual memory is traditional in many respects. For
example, it uses a variation on the ‘‘clock’’ algorithm for its page replacement
mechanism, and provides shared read-write data segments using a straightforward
extension of the mechanism that provides shared read-only code in time-shared
UNIX.
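For readers unfamiliar with the clock algorithm, the sketch below shows the textbook version (second chance with a reference bit per frame). Sprite uses a variation that is not described here, so this is only the basic technique; the `ClockReplacer` class is a name invented for the example.

```python
class ClockReplacer:
    """Textbook clock (second-chance) page replacement over a fixed set of frames."""

    def __init__(self, num_frames: int):
        self.pages = [None] * num_frames        # page held by each frame
        self.referenced = [False] * num_frames  # reference bit per frame
        self.hand = 0                           # position of the clock hand

    def access(self, page):
        """Touch a page; return the page evicted to make room, or None."""
        if page in self.pages:
            self.referenced[self.pages.index(page)] = True
            return None
        # Sweep the hand, clearing reference bits, until a frame whose bit
        # is already clear is found; that frame is the victim.
        while self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        victim = self.pages[self.hand]
        self.pages[self.hand] = page
        self.referenced[self.hand] = True
        self.hand = (self.hand + 1) % len(self.pages)
        return victim
```

A page that has been referenced since the hand last passed it gets a second chance: its bit is cleared and the hand moves on instead of evicting it.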
The simplest approach to process migration is:
[1] ‘‘Freeze’’ the process (prevent it from executing any more).
[2] Transfer its state to the new machine, including registers and other
execution state, virtual memory, and file access.
[3] ‘‘Unfreeze’’ the process on its new machine, so that it may continue
executing.
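The three steps can be sketched as follows. This is a toy model, not Sprite's mechanism: `ProcessState`, `Host` and `migrate` are hypothetical names, and `copy.deepcopy` stands in for shipping the state over the network.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ProcessState:
    pid: int
    registers: dict
    memory: dict        # virtual page number -> page contents
    open_files: list
    frozen: bool = False


@dataclass
class Host:
    name: str
    processes: dict = field(default_factory=dict)


def migrate(pid: int, source: Host, target: Host) -> ProcessState:
    proc = source.processes[pid]
    proc.frozen = True                 # [1] freeze: no further execution
    moved = copy.deepcopy(proc)        # [2] transfer registers, memory, files
    target.processes[pid] = moved
    del source.processes[pid]
    moved.frozen = False               # [3] unfreeze on the new machine
    return moved
```

The process is stopped for the whole of step [2], so the cost of this simple approach grows with the amount of state that has to be copied.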
7.Why is this work important?
Sharing is what users want, in order to work cooperatively and make the best use
of the hardware
resources.
Flexibility is what system administrators want, in order to allow the system to
evolve gracefully.
Finally, performance is what everyone wants.
All three are important, and Sprite provides them.
8.Can any improvement be done?
No
(2)
Q1. What is the problem the authors are trying to
solve?
A: The authors want to design a high-performance operating system for
multiprocessor workstations that provides network transparency.
Q2. What other approaches or solutions existed at the time that this work was
done?
A: Berkeley UNIX, 4.2 BSD, CMU-ITC's Andrew, Sun's NFS, etc.
Q3. What was wrong with the other approaches or solutions?
A: Most operating systems had lower performance and were difficult to share and
administer.
Earlier file systems allowed remote file access only through special programs
such as rcp; most programs could access only files on the local disk.
Earlier UNIX systems also did not permit memory to be shared between processes.
Q4. What is the authors' approach or solution?
A: The authors designed a new network operating system, called the Sprite
Network Operating System, that is similar to UNIX.
Sprite has an RPC protocol that allows each workstation to invoke operations on
other workstations.
Sprite is multi-threaded and allows processes to share memory, including static
data, dynamic data, and code.
Q5. Why is it better than the other approaches or solutions?
A: Sprite achieves high performance because its kernels communicate through an
RPC protocol and because it flexibly caches programs and files.
Q6. How does it perform?
A: They ran a series of benchmark programs on Sun-3/75 workstations to measure
caching.
All I/O and paging traffic was measured on a Sun-3/180 file server.
However, the authors did not measure the cost of cache consistency, because the
benchmarks did not share files.
Q7. Why is this work important?
A: The Sprite network operating system provides simpler administration and
performs efficiently.
Q8. Can any improvement be done?
A: No.
(3)
1. What is the problem the authors are trying to
solve?
The authors want to build a distributed system that supports process migration and better transparency.
2. What other approaches or solutions existed at the time that this work was
done?
LOCUS, Andrew, and 4.2BSD existed at that time.
3. What was wrong with the other approaches or solutions?
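4.2BSD allows only certain programs to be used to invoke programs on remote machines.
Apollo's Aegis requires a file name to state which host the file is on.
Andrew and NFS feel like using your own computer, but they cannot use remote devices.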
4. What is the authors' approach or solution?
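The authors use prefixes to manage files. Each computer manages one domain, so using a file on
another computer feels like using a file on your own disk.
The authors' process migration moves the dirty memory, the files in use, and the environment
variables of the process being migrated to another workstation. The user's program may therefore
run on another computer, but the result is the same as if it had run on the user's own computer;
it does not differ from one workstation to another.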
5. Why is it better than the other approaches or solutions?
Because the other approaches are not transparent enough: to use another computer's resources, extra parameters and the like must be set.
6. How does it perform?
7. Why is this work important?
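Because users no longer need to consider how other workstations differ from their own; at most
they need to know how to use their own workstation. Programmers also need not consider whether a
program will be moved to another workstation. Sprite takes care of everything that needs considering.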
8. Can any improvement be done?
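If workstations communicate frequently, the lower layer could use a different link, such as fiber
FDDI (MTU = 4352) or ATM (MTU = 9180). Both have a larger MTU than Ethernet (MTU = 1500), so
messages would not have to be split into as many packets.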
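The effect of the MTU on the number of packets can be checked with a short calculation. The MTU values are the ones quoted above; the 16-Kbyte message size is an arbitrary example, and per-packet header overhead is ignored, so the counts are only approximate.

```python
import math

# MTU values in bytes, as quoted in the answer above.
MTU_BYTES = {"Ethernet": 1500, "FDDI": 4352, "ATM": 9180}


def packets_needed(message_bytes: int, mtu: int) -> int:
    # Simplification: the whole MTU is treated as payload (no headers).
    return math.ceil(message_bytes / mtu)


# Example message size of 16 Kbytes (an assumption for illustration).
counts = {link: packets_needed(16 * 1024, mtu) for link, mtu in MTU_BYTES.items()}
```

Under these assumptions the same message needs 11 packets on Ethernet, 4 on FDDI, and 2 on ATM.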
(4)
Q1.What is the problem the authors are trying to
solve?
A:The authors' goal is to support applications running on SPUR workstations, and
they hope that the system will work well for a variety of high-performance
engineering workstations.
Q2.What other approaches or solutions existed at the time that this work was
done?
A:
1.The LOCUS Distributed System Architecture.
2.Scale and Performance in a Distributed File System.
3.Mach: A New Kernel Foundation for UNIX Development.
4.CMU-ITC’s Andrew and Sun’s NFS.
Q3.What was wrong with the other approaches or solutions?
A:The early versions of UNIX allowed remote file access only with a few special
programs, and did not permit memory to be shared between user processes. Andrew
and NFS do not permit applications to access I/O devices on other machines.
Q4.What is the authors' approach or solution?
A:The authors' approach is the Sprite network operating system. Sprite is an
operating system for networked uniprocessor and multiprocessor workstations.
Q5.Why is it better than the other approaches or solutions?
A:
1.It allows the kernel of each workstation to invoke operations on other
workstations.
2.Its file system is implemented as a collection of domains on different server
machines.
3.It caches file data both on server machines and client machines.
4.The virtual memory system uses ordinary files for backing storage.
5.It guarantees that processes behave the same whether migrated or not.
Q6.How does it perform?
A:Sprite's operation is summarized as follows:
1.User processes execute "trap" instructions to switch to supervisor state.
2.The kernel executes as a privileged extension of the user process.
3.Using a small per-process kernel stack for procedure invocation within the
kernel.
Q7.Why is this work important?
A:Sprite can move a process or a group of processes to an idle machine, so it
takes advantage of the processing power of the workstations.
Q8.Can any improvement be done?
A:No.
(5)
Q1. What is the problem the authors are trying to
solve?
A:
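The authors believe that computer technology will develop in three directions: networks, large
memories, and multiprocessors. They therefore built a new operating system that makes a
distributed environment easier to manage, can manage large memories, and provides an
environment for multiprocessor applications that obtains the benefits of multiple processors.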
Q2. What other approaches or solutions existed at the time that this work was
done?
A:
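- Network file systems
LOCUS: one of the earliest file systems to achieve transparency.
Andrew, NFS: file systems that offer varying degrees of transparency.
- Shared address spaces
Some versions of UNIX allow read-write memory to be shared between user processes,
including System V, SunOS, Berkeley UNIX, and Mach.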
Q3. What was wrong with the other approaches or solutions?
A:
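- Network file systems
The authors state that the ultimate goal of recent network file systems is network transparency.
Recent network file systems that support name transparency include LOCUS, Andrew, and NFS. Of
these, Andrew and NFS allow only part of the file system hierarchy to be shared, and they do not
allow applications on one machine to access I/O devices on other machines.
- Shared address spaces
Some early versions of UNIX did not allow memory to be shared between users except for reading.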
Q4. What is the authors' approach or solution?
A:
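Starting from a UNIX-like kernel, the authors made the following improvements:
- The kernel contains a remote procedure call (RPC) facility, used to exchange data with other
workstations or to carry out operations on them.
- The file system uses prefix tables to manage the name space; this structure helps system
administration.
- Caching is used to improve file system performance. Sprite caches file data on both clients
and servers, and uses a simple mechanism to keep files consistent.
- Virtual memory holds the pages swapped out of physical memory. The sticky segments mechanism
retains a program's pages in order to reduce how often running programs have to request data.
- Sprite's process migration differs from that of other systems in two main ways.
First, the virtual memory of a process is transferred between machines.
Second, migration is built to be transparent to the migrated process.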
Q5. Why is it better than the other approaches or solutions?
A:
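- Sprite allows I/O devices and disks on the network to be shared by different machines.
- The virtual memory mechanism allows physical memory on one machine to be shared by different
processes, which gives higher performance on multiprocessors.
- Processes can be placed on idle workstations to run, which shares processing power.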
Q6. How does it perform?
A:
With caches of a few Mbytes, diskless workstations perform within 1% to 12% of workstations
with local disks. Without caches, diskless workstations are 10% to 40% slower than
workstations with local disks.
In addition, caching affects the number of clients that can be supported: with client caches,
the system can support more clients.
Q7. Why is this work important?
A:
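To keep up with future trends in computer technology, operating systems must change. Only a
good management mechanism can bring out the full performance of the hardware.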
Q8. Can any improvement be done?
A: No
(6)
Q1.What is the problem the authors are
trying to solve?
A:
The authors try to build an operating system that solves several problems
encountered by other operating systems of that time.
Kernel: They modified the kernel to handle multi-threaded processes. (This
approach improves performance on multiprocessor systems.)
Process Migration: Processes can move among workstations on the network to
share the system load.
RPC: To offload work to idle remote workstations, a program has to communicate
with other machines in the network. The authors modified the RPC model to
handle such needs.
File system: There were a few existing approaches to network file systems, but
the authors try to resolve some of their inconveniences in Sprite's file system.
Memory: To use memory more efficiently, the authors plan to let processes
(parent and child processes) share data.
Q2.What other approaches or solutions existed at the time that this work was
done?
A:
LOCUS, Andrew, and NFS are similar network file systems.
Q3.What was wrong with the other approaches or solutions?
A:
Some other approaches use file names that are not transparent.
Applications running on Andrew and NFS can access only the local machine's I/O
devices.
Other approaches can only initiate new processes on other machines; they cannot
move running processes among machines.
Q4.What is the authors' approach or solution?
A:
File Name Space: The authors use prefix tables to manage files stored on
separate workstations in the network. Any change is broadcast to the network,
and each client records the change in its prefix table.
RPC: The authors built two components, “stubs” and “RPC transport”, to handle
the transfer of arguments and results among machines in the network.
Multi-threading: Most operating systems of that period (the 1980s) were
single-threaded. A single-threaded kernel is simple to implement but limits
system performance on multiprocessor workstations.
Process migration: First, the process is frozen; then its state (registers,
execution state) is transferred to the new workstation. The new workstation
unfreezes the process, which continues executing.
Q5.Why is it better than the other approaches or solutions?
A:
All these network file systems have the same purpose: sharing.
But the architectures of some of these systems have a few defects.
Sprite's approach addresses these issues for end users.
It also introduces multi-threading to improve system performance, which the
other systems lack.
Q6.How does it perform?
A:
Only two tables in this paper describe the performance of this approach.
1. Process migration can save execution time.
When four processes execute on one machine, they take about 60 seconds, but
they take only 20 seconds when the processes are migrated to other
workstations on the LAN.
2. Client caching affects system performance.
According to the results, without client caching, network traffic increases and
can become the bottleneck of the system.
Q7.Why is this work important?
A:
This paper tries to improve the operating system in many ways.
The combined result of these efforts enhances system performance dramatically.
Q8.Can any improvement be done?
A:
The authors state that arguments and results are transferred over the network
in clear text.
I think this would be a problem today (although encrypting these messages
carries its own cost).
(7)
1.What is the problem the authors are trying to
solve?
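More and more organizations are using LANs to connect PCs to workstations. Although many PCs
are connected to the workstations, there is a problem: no good management mechanism exists for
sharing the workstations' processing power. The authors believe that workstation performance
will keep increasing and that it would be a waste not to use it well, so they propose Sprite
to support the development of multiprocessor applications and make full use of that performance.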
2.What other approaches or solutions existed at the time that this work was
done?
LOCUS, Andrew, and NFS, like Sprite, provide name transparency: the location of a file does not need to be specified explicitly.
3.What was wrong with the other approaches or solutions?
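Andrew and NFS can share only part of the file system hierarchy. In addition, Andrew and NFS do
not allow applications on the local machine to access the I/O devices of remote machines.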
4.What is the authors' approach or solution?
Locating a File
Managing Prefix Tables
5.Why is it better than the other approaches or solutions?
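Because Sprite provides full transparency. An application can run on different workstations just
as if it were running on a single time-sharing system, and a single file hierarchy can be
accessed consistently from many workstations. Any application can access remote files just as it
accesses local files, and remote I/O devices can be accessed as well.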
6.How does it perform?
According to the authors' measurements, Sprite is 30% faster than Sun's NFS.
7.Why is this work important?
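Thanks to Sprite, users can share data with others over the network, which makes full use of
hardware resources; as a result, overall execution speed also improves.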
8.Can any improvement be done?
No
(8)
Q1. What is the problem the authors are trying to
solve?
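To cope with hardware trends (ever faster networks, ever larger memories, and ever more CPUs),
operating systems must change. Because of their distributed nature, today's local-area network
environments suffer from poor performance and are hard to share and administer. The authors
believe it is necessary to hide this distribution so that the system looks like a time-shared
machine.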
Q2. What other approaches or solutions existed at the time that this work was
done?
1. File systems
LOCUS, Andrew, NFS
2. Memory sharing
UNIX
3. Process migration
rsh command of BSD, rex facility of Sun's UNIX
4. RPC
Conventional RPC
5. File name space
Most UNIX derivatives
Q3. What was wrong with the other approaches or solutions?
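1. File systems
Andrew and NFS do not allow access to I/O devices on other machines; LOCUS is close to Sprite.
2. Memory sharing
Only code can be shared.
3. Process migration
They can only start a process on the other machine; they cannot hand over one of their own processes for the other machine to execute.
4. RPC
A specialized compiler is needed to generate stubs for the programs that need them.
5. File name space
A specific area must be read at boot time; domains cannot be mounted dynamically.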
Q4. What is the authors' approach or solution?
Main points:
1. transparent network file system
2. a simple mechanism for sharing writable memory between processes on a single
workstation
3. a mechanism for migrating processes between workstations in order to take
advantage of idle machines.
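Details:
1. RPC is built into the kernel.
2. The file system is a collection of domains managed with prefix tables; each machine can attach itself dynamically.
3. Both clients and servers cache files, so that programs behave as if they ran on a single machine
however they are run. (? Is the client's copy duplicated into the server's main memory?)
4. Virtual memory system: each workstation creates backing files for its own processes (on disk? in memory?).
Code segments are kept in main memory (on which machine?) and are not cleared when the program ends, to speed up restart.
The virtual memory system negotiates with the file system over physical memory.
5. Every process is guaranteed to produce the same result as on its own machine, whether or not it was sent to another machine to run.
Resources:
1. Every process can operate on every disk in the whole network.
2. Memory (of the whole network?) can be shared by the processes on the same machine.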
Q5. Why is it better than the other approaches or solutions?
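There is a single underlying reason for the advantages below: the authors believe that treating everything as one system gives the best performance.
1. File systems
Sprite lets any process access any device.
2. Memory sharing
Not only code but also data is shared, because sharing is the fastest way for multiple processors to communicate.
3. Process migration
A process executing remotely behaves as if it were executing locally; the user sees no difference.
4. RPC
Stubs are built into the kernel, so users do not notice that they are using RPC. For efficient communication between
machines, it uses implicit acknowledgements and fragmentation. With the former, the client only sends a request and
waits for the response, and resends the request if no response arrives. With the latter, a request that does not fit
in one packet is split into fragments.
5. File name space
Prefix tables declare which top-level directories a machine holds, and broadcast is used to inform other machines, which makes dynamic mounting possible.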
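The two RPC techniques mentioned in item 4 can be sketched together. This is only an illustration of the idea: the function names, the 1024-byte payload limit, and the retry count are assumptions, and `flaky_send` is a made-up stand-in for a network that loses the first request.

```python
def fragment(message: bytes, max_payload: int) -> list:
    # Split a message that does not fit in one packet into fragments.
    return [message[i:i + max_payload] for i in range(0, len(message), max_payload)]


def call_with_implicit_ack(request: bytes, send, max_payload: int = 1024, max_tries: int = 3) -> bytes:
    """Send a request; the response itself serves as the acknowledgement."""
    fragments = fragment(request, max_payload)
    for _ in range(max_tries):
        reply = send(fragments)    # returns None when no response arrived
        if reply is not None:
            return reply           # the reply implicitly acknowledges the request
    raise TimeoutError("no response from server")


# Demonstration with a hypothetical transport that drops the first attempt.
attempts = []


def flaky_send(fragments):
    attempts.append(len(fragments))
    if len(attempts) == 1:
        return None                           # first request is lost
    return b"".join(fragments).upper()        # server reassembles and replies


result = call_with_implicit_ack(b"a" * 2500, flaky_send)
```

No separate acknowledgement packet is sent in the common case, which is why the scheme saves messages when requests and responses are rarely lost.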
Q6. How does it perform?
The paper gives no clear examples that compare Sprite with other distributed systems.
Q7. Why is this work important?
Because it would be a great pity not to exploit the performance of multiple computers.
Q8. Can any improvement be done?
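The authors seem to imply that a very powerful master server is needed to coordinate things (?),
such as mounting file systems and letting other machines use the backing files of virtual memory
(without a master server, how could they be reached?), but the paper does not seem to say so
explicitly. As for shared memory, the authors seem to mean that resource-poor clients share the
memory of a powerful host in the same way that they access network files. But what happens if the
cost of the network bottleneck is higher than the cost of disk access?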