mylist = list(set(mylist))



Removing duplicate elements from a list


7 replies 2015-01-14 23:45:19 +08:00

9hills

Jan 11, 2015

Going through set changes the list's order, and converting back with list() does not restore it.

To preserve order:
a = set()
l2 = [i for i in l1 if not (i in a or a.add(i))]
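The one-liner above works because set.add() returns None: `a.add(i)` is only evaluated when `i` is not yet in `a`, and its falsy result lets the element through. A more explicit sketch of the same idea follows; the function name `dedupe_keep_order` is made up for illustration and does not come from the thread.

```python
def dedupe_keep_order(items):
    """Return a new list without duplicates, keeping each first occurrence."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # elements must be hashable for this check
            seen.add(item)
            result.append(item)
    return result


l1 = ['a', 'b', 'r', 'a', 'b', 'k', 'v', 'b']
l2 = dedupe_keep_order(l1)
print(l2)  # ['a', 'b', 'r', 'k', 'v']
```

The loop does the same work as the comprehension but avoids relying on a side effect inside a condition, which some readers find easier to follow.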

dant

Jan 11, 2015 via iPhone

By the way, is there an OrderedSet?

4mrqn07k

Jan 11, 2015

@dant Being unordered is one of the three defining properties of a set.

zergling

Jan 11, 2015

Or:
from collections import OrderedDict
l2 = OrderedDict(zip(l1, l1)).keys()
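A slightly shorter variant of the same approach, as a sketch: OrderedDict.fromkeys() builds the mapping directly, and wrapping the result in list() gives a real list on both Python 2 and Python 3 (on Python 3, .keys() returns a view rather than a list). The sample data here is illustrative.

```python
from collections import OrderedDict

l1 = [3, 1, 3, 2, 1]

# fromkeys() maps every element to None; keys keep first-insertion order.
l2 = list(OrderedDict.fromkeys(l1))
print(l2)  # [3, 1, 2]

# On Python 3.7+, a plain dict also preserves insertion order,
# so this gives the same result there.
l3 = list(dict.fromkeys(l1))
print(l3)  # [3, 1, 2]
```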

deepurple

Jan 11, 2015

I just watched Dong Weiming's video course 《Python高级编程》 (Advanced Python Programming), and it covers exactly this.

raquelken

Jan 12, 2015

@zergling Deduplicating through dict keys only works if the list elements are hashable. There is also this approach: {}.fromkeys(mylist).keys()
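To illustrate the hashability point: a list of lists cannot go through set() or dict keys at all. One possible fallback, sketched below, compares elements by equality instead; it is quadratic in the list length, so it only suits small inputs. The name `dedupe_unhashable` and the sample data are assumptions for illustration.

```python
def dedupe_unhashable(items):
    """Order-preserving dedupe that needs only ==, not hashing. O(n^2)."""
    result = []
    for item in items:
        if item not in result:  # linear scan using equality comparison
            result.append(item)
    return result


rows = [[1, 2], [3], [1, 2]]

try:
    set(rows)  # lists are unhashable, so this raises TypeError
except TypeError as exc:
    print("set() failed:", exc)

unique_rows = dedupe_unhashable(rows)
print(unique_rows)  # [[1, 2], [3]]
```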

dongweiming

Jan 14, 2015

@deepurple Let me follow up on that. The standard answer is here: https://docs.python.org/2/faq/programming.html#id40

Everything above is just a collection of tricks.

In [1]: mylist =['a', 'b', 'r', 'a', 'b', 'k', 'v', 'b']

In [2]: %timeit list(set(mylist))
The slowest run took 7.63 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 781 ns per loop

In [3]: %timeit {}.fromkeys(mylist).keys()
The slowest run took 4.30 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 887 ns per loop

In [4]: from collections import OrderedDict

In [5]: %timeit OrderedDict.fromkeys(mylist).keys()
The slowest run took 11.18 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 18.5 µs per loop

In [6]: %timeit list({}.fromkeys(mylist))
The slowest run took 5.99 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 995 ns per loop
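For anyone who wants to rerun the comparison outside IPython, here is a rough sketch using the standard timeit module. Absolute numbers will differ by machine and Python version, and the lambda wrappers add a little call overhead, so only the relative ordering is meaningful. Each candidate is wrapped in list() so that all of them build an actual list on Python 3 as well.

```python
import timeit
from collections import OrderedDict

mylist = ['a', 'b', 'r', 'a', 'b', 'k', 'v', 'b']

# Labels mirror the expressions timed in the session above.
candidates = {
    "list(set(mylist))": lambda: list(set(mylist)),
    "list({}.fromkeys(mylist))": lambda: list({}.fromkeys(mylist)),
    "list(OrderedDict.fromkeys(mylist))":
        lambda: list(OrderedDict.fromkeys(mylist)),
}

number = 100000
results = {}
for label, func in candidates.items():
    # Best of 3 repeats, converted to seconds per single call.
    best = min(timeit.repeat(func, number=number, repeat=3)) / number
    results[label] = best
    print("%-36s %8.0f ns per call" % (label, best * 1e9))
```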